System for the in vivo delivery and expression of heterologous genes in the bone marrow

ABSTRACT

The present invention provides a method of delivering immunogenic or therapeutic proteins to bone marrow cells using alphavirus vectors. The alphavirus vectors disclosed herein target specifically to bone marrow tissue, and viral genomes persist in bone marrow for at least three months post-infection. No or very low levels of virus were detected in quadriceps, brain, and sera of treated animals. The sequence of a consensus Sindbis cDNA clone, pTR339, and infectious RNA transcripts, infectious virus particles, and pharmaceutical formulations derived therefrom are also disclosed. The sequence of the genomic RNA of the Girdwood S.A. virus, and cDNA clones, infectious RNA transcripts, infectious virus particles, and pharmaceutical formulations derived therefrom are also disclosed.

RELATED APPLICATION INFORMATION

This application is filed under 35 U.S.C. §371 of PCT Application No. PCT/US98/02945, filed on Feb. 18, 1999, the disclosure of which is incorporated by reference herein in its entirety, which is a continuation-in-part of co-pending U.S. application Ser. No. 08/801,263, filed on Feb. 19, 1997, which issued as U.S. Pat. No. 5,811,407, the disclosure of which is incorporated by reference herein in its entirety.

FEDERALLY SPONSORED RESEARCH

This invention was made with Government support under Grant Number 5 RO1 AI22186 from the National Institutes of Health. The Government has certain rights to this invention.

FIELD OF THE INVENTION

The present invention relates to recombinant DNA technology, and in particular to introducing and expressing foreign DNA in a eukaryotic cell.

BACKGROUND OF THE INVENTION

The Alphavirus genus includes a variety of viruses all of which are members of the Togaviridae family. The alphaviruses include Eastern Equine Encephalitis virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades virus, Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), Sindbis virus, South African Arbovirus No. 86 (S.A.AR86), Girdwood S.A. virus, Ockelbo virus, Semliki Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, Aura virus, Whataroa virus, Babanki virus, Kyzylagach virus, Highlands J virus, Fort Morgan virus, Ndumu virus, and Buggy Creek virus.

The alphavirus genome is a single-stranded, messenger-sense RNA, modified at the 5′-end with a methylated cap, and at the 3′-end with a variable-length poly(A) tract. The viral genome is divided into two regions: the first encodes the nonstructural or replicase proteins (nsP1-nsP4) and the second encodes the viral structural proteins. Strauss and Strauss, Microbiological Rev. 58, 491-562, 494 (1994). Structural subunits consisting of a single viral protein, C, associate with themselves and with the RNA genome in an icosahedral nucleocapsid. In the virion, the capsid is surrounded by a lipid envelope covered with a regular array of transmembranal protein spikes, each of which consists of a heterodimeric complex of two glycoproteins, E1 and E2. See Paredes et al., Proc. Natl. Acad. Sci. USA 90, 9095-99 (1993); Paredes et al., Virology 187, 324-32 (1993); Pedersen et al., J. Virol. 14:40 (1974).

Sindbis virus, the prototype member of the alphavirus genus of the family Togaviridae, and viruses related to Sindbis are broadly distributed throughout Africa, Europe, Asia, the Indian subcontinent, and Australia, based on serological surveys of humans, domestic animals and wild birds. Kokernot et al., Trans. R. Soc. Trop. Med. Hyg. 59, 553-62 (1965); Redaksie, S. Afr. Med. J. 42, 197 (1968); Adekolu-John and Fagbami, Trans. R. Soc. Trop. Med. Hyg. 77, 149-51 (1983); Darwish et al., Trans. R. Soc. Trop. Med. Hyg. 77, 442-45 (1983); Lundström et al., Epidemiol. Infect. 106, 567-74 (1991); Morrill et al., J. Trop. Med. Hyg. 94, 166-68 (1991). The first isolate of Sindbis virus (strain AR339) was recovered from a pool of Culex sp. mosquitoes collected in Sindbis, Egypt in 1953 (Taylor et al., Am. J. Trop. Med. Hyg. 4, 844-62 (1955)), and is the most extensively studied representative of this group. Other members of the Sindbis group of alphaviruses include South African Arbovirus No. 86, Ockelbo82, and Girdwood S.A. These viruses are not strains of the Sindbis virus; they are related to Sindbis AR339, but they are more closely related to each other based on nucleotide sequence and serological comparisons. Lundström et al., J. Wildl. Dis. 29, 189-95 (1993); Simpson et al., Virology 222, 464-69 (1996). Ockelbo82, S.A.AR86 and Girdwood S.A. are all associated with human disease, whereas Sindbis is not. The clinical symptoms of human infection with Ockelbo82, S.A.AR86, or Girdwood S.A. are a febrile illness, general malaise, maculopapular rash, and joint pain that occasionally progresses to a polyarthralgia sometimes lasting from a few months to a few years.

The study of these viruses has led to the development of beneficial techniques for vaccinating against the alphavirus diseases, and other diseases through the use of alphavirus vectors for the introduction of foreign DNA. See U.S. Pat. No. 5,185,440 to Davis et al., and PCT Publication WO 92/10578. It is intended that all United States patent references be incorporated in their entirety by reference.

It is well known that live, attenuated viral vaccines are among the most successful means of controlling viral disease. However, for some virus pathogens, immunization with a live virus strain may be either impractical or unsafe. One alternative strategy is the insertion of sequences encoding immunizing antigens of such agents into a vaccine strain of another virus. One such system utilizing a live VEE vector is described in U.S. Pat. No. 5,505,947 to Johnston et al.

Sindbis virus vaccines have been employed as viral carriers in virus constructs which express genes encoding immunizing antigens for other viruses. See U.S. Pat. No. 5,217,879 to Huang et al. Huang et al. describes Sindbis infectious viral vectors. However, the reference does not describe the cDNA sequence of Girdwood S.A. and TR339, nor clones or viral vectors produced therefrom.

Another such system is described by Hahn et al., Proc. Natl. Acad. Sci. USA 89:2679 (1992), wherein Sindbis virus constructs which express a truncated form of the influenza hemagglutinin protein are described. The constructs are used to study antigen processing and presentation in vitro and in mice. Although no infectious challenge dose is tested, it is also suggested that such constructs might be used to produce protective B- and T-cell mediated immunity.

London et al., Proc. Natl. Acad. Sci. USA 89, 207-11 (1992), disclose a method of producing an immune response in mice against a lethal Rift Valley Fever (RVF) virus by infecting the mice with an infectious Sindbis virus containing an RVF epitope. London does not disclose using Girdwood S.A. or TR339 to induce an immune response in animals.

Viral carriers can also be used to introduce and express foreign DNA in eukaryotic cells. One goal of such techniques is to employ vectors that target expression to particular cells and/or tissues. A current approach has been to remove target cells from the body, culture them ex vivo, infect them with an expression vector, and then reintroduce them into the patient.

PCT Publication No. WO 92/10578 to Garoff and Liljestrom provides a system for introducing and expressing foreign proteins in animal cells using alphaviruses. This reference discloses the use of Semliki Forest virus to introduce and express foreign proteins in animal cells. The use of Girdwood S.A. or TR339 is not discussed. Furthermore, this reference does not provide a method of targeting and introducing foreign DNA into specific cell or tissue types.

Accordingly, there remains a need in the art for full-length cDNA clones of positive-strand RNA viruses, such as Girdwood S.A. and TR339. In addition, there is an ongoing need in the art for improved vaccination strategies. Finally, there remains a need in the art for improved methods and nucleic acid sequences for delivering foreign DNA to target cells.

SUMMARY OF THE INVENTION

A first aspect of the present invention is a method of introducing and expressing heterologous RNA in bone marrow cells, comprising: (a) providing a recombinant alphavirus, the alphavirus containing a heterologous RNA segment, the heterologous RNA segment comprising a promoter operable in bone marrow cells operatively associated with a heterologous RNA to be expressed in bone marrow cells; and then (b) contacting the recombinant alphavirus to the bone marrow cells so that the heterologous RNA segment is introduced and expressed therein.

As a second aspect, the present invention provides a helper cell for expressing an infectious, propagation defective, Girdwood S.A. virus particle, comprising, in a Girdwood S.A.-permissive cell: (a) a first helper RNA encoding (i) at least one Girdwood S.A. structural protein, and (ii) not encoding at least one other Girdwood S.A. structural protein; and (b) a second helper RNA separate from the first helper RNA, the second helper RNA (i) not encoding the at least one Girdwood S.A. structural protein encoded by the first helper RNA, and (ii) encoding the at least one other Girdwood S.A. structural protein not encoded by the first helper RNA, and with all of the Girdwood S.A. structural proteins encoded by the first and second helper RNAs assembling together into Girdwood S.A. particles in the cell containing the replicon RNA; and wherein the Girdwood S.A. packaging segment is deleted from at least the first helper RNA.

A third aspect of the present invention is a method of making infectious, propagation defective, Girdwood S.A. virus particles, comprising: transfecting a Girdwood S.A.-permissive cell with a propagation defective replicon RNA, the replicon RNA including the Girdwood S.A. packaging segment and an inserted heterologous RNA; producing the Girdwood S.A. virus particles in the transfected cell; and then collecting the Girdwood S.A. virus particles from the cell. Also disclosed are infectious Girdwood S.A. RNAs, cDNAs encoding the same, infectious Girdwood S.A. virus particles, and pharmaceutical formulations thereof.

As a fourth aspect, the present invention provides a helper cell for expressing an infectious, propagation defective, TR339 virus particle, comprising, in a TR339-permissive cell: (a) a first helper RNA encoding (i) at least one TR339 structural protein, and (ii) not encoding at least one other TR339 structural protein; and (b) a second helper RNA separate from the first helper RNA, the second helper RNA (i) not encoding the at least one TR339 structural protein encoded by the first helper RNA, and (ii) encoding the at least one other TR339 structural protein not encoded by the first helper RNA, and with all of the TR339 structural proteins encoded by the first and second helper RNAs assembling together into TR339 particles in the cell containing the replicon RNA; and wherein the TR339 packaging segment is deleted from at least the first helper RNA.

A fifth aspect of the present invention is a method of making infectious, propagation defective, TR339 virus particles, comprising: transfecting a TR339-permissive cell with a propagation defective replicon RNA, the replicon RNA including the TR339 packaging segment and an inserted heterologous RNA; producing the TR339 virus particles in the transfected cell; and then collecting the TR339 virus particles from the cell. Also disclosed are infectious TR339 RNAs, cDNAs encoding the same, infectious TR339 virus particles, and pharmaceutical formulations thereof.

As a sixth aspect, the present invention provides a recombinant DNA comprising a cDNA coding for an infectious Girdwood S.A. virus RNA transcript, and a heterologous promoter positioned upstream from the cDNA and operatively associated therewith. The present invention also provides infectious RNA transcripts encoded by the above-mentioned cDNA and infectious viral particles containing the infectious RNA transcripts.

As a seventh aspect, the present invention provides a recombinant DNA comprising a cDNA coding for a Sindbis strain TR339 RNA transcript, and a heterologous promoter positioned upstream from the cDNA and operatively associated therewith. The present invention also provides infectious RNA transcripts encoded by the above-mentioned cDNA and infectious viral particles containing the infectious RNA transcripts.

The foregoing and other aspects of the present invention are described in the detailed description set forth below.

DETAILED DESCRIPTION OF THE INVENTION

The production and use of recombinant DNA, vectors, transformed host cells, selectable markers, proteins, and protein fragments by genetic engineering are well-known to those skilled in the art. See, e.g., U.S. Pat. No. 4,761,371 to Bell et al. at Col. 6 line 3 to Col. 9 line 65; U.S. Pat. No. 4,877,729 to Clark et al. at Col. 4 line 38 to Col. 7 line 6; U.S. Pat. No. 4,912,038 to Schilling at Col. 3 line 26 to Col. 14 line 12; and U.S. Pat. No. 4,879,224 to Wallner at Col. 6 line 8 to Col. 8 line 59.

The term “alphavirus” has its conventional meaning in the art, and includes the various species of alphaviruses such as Eastern Equine Encephalitis virus (EEE), Venezuelan Equine Encephalitis virus (VEE), Everglades virus, Mucambo virus, Pixuna virus, Western Equine Encephalitis virus (WEE), Sindbis virus, South African Arbovirus No. 86, Girdwood S.A. virus, Ockelbo virus, Semliki Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, Una virus, Aura virus, Whataroa virus, Babanki virus, Kyzylagach virus, Highlands J virus, Fort Morgan virus, Ndumu virus, Buggy Creek virus, and any other virus classified by the International Committee on Taxonomy of Viruses (ICTV) as an alphavirus. The preferred alphaviruses for use in the present invention include Sindbis virus strains (e.g., TR339), Girdwood S.A., S.A.AR86, and Ockelbo82.

An “Old World alphavirus” is a virus that is primarily distributed throughout the Old World. Alternately stated, an Old World alphavirus is a virus that is primarily distributed throughout Africa, Asia, Australia and New Zealand, or Europe. Exemplary Old World viruses include SF group alphaviruses and SIN group alphaviruses. SF group alphaviruses include Semliki Forest virus, Middelburg virus, Chikungunya virus, O'Nyong-Nyong virus, Ross River virus, Barmah Forest virus, Getah virus, Sagiyama virus, Bebaru virus, Mayaro virus, and Una virus. SIN group alphaviruses include Sindbis virus, South African Arbovirus No. 86, Ockelbo virus, Girdwood S.A. virus, Aura virus, Whataroa virus, Babanki virus, and Kyzylagach virus.

Acceptable alphaviruses include those containing attenuating mutations. The phrases “attenuating mutation” and “attenuating amino acid,” as used herein, mean a nucleotide sequence containing a mutation, or an amino acid encoded by a nucleotide sequence containing a mutation, which mutation results in a decreased probability of causing disease in its host (i.e., a loss of virulence), in accordance with standard terminology in the art, whether the mutation be a substitution mutation or an in-frame deletion mutation. See, e.g., B. DAVIS ET AL., MICROBIOLOGY 132 (3d ed. 1980). The phrase “attenuating mutation” excludes mutations or combinations of mutations which would be lethal to the virus.

Appropriate attenuating mutations will be dependent upon the alphavirus used. Suitable attenuating mutations within the alphavirus genome will be known to those skilled in the art. Exemplary attenuating mutations include, but are not limited to, those described in U.S. Pat. No. 5,505,947 to Johnston et al., copending U.S. application Ser. No. 08/448,630 to Johnston et al., and copending U.S. application Ser. No. 08/446,932 to Johnston et al. It is intended that all United States patent references be incorporated in their entirety by reference.

Attenuating mutations may be introduced into the RNA by performing site-directed mutagenesis on the cDNA which encodes the RNA, in accordance with known procedures. See, Kunkel, Proc. Natl. Acad. Sci. USA 82, 488 (1985), the disclosure of which is incorporated herein by reference in its entirety. Alternatively, mutations may be introduced into the RNA by replacement of homologous restriction fragments in the cDNA which encodes for the RNA, in accordance with known procedures.

I. Methods for Introducing and Expressing Heterologous RNA in Bone Marrow Cells

The present invention provides methods of using a recombinant alphavirus to introduce and express a heterologous RNA in bone marrow cells. Such methods are useful as vaccination strategies when the heterologous RNA encodes an immunogenic protein or peptide. Alternatively, such methods are useful in introducing and expressing in bone marrow cells an RNA which encodes a desirable protein or peptide, for example, a therapeutic protein or peptide.

The present invention is carried out using a recombinant alphavirus to introduce a heterologous RNA into bone marrow cells. Any alphavirus that targets and infects bone marrow cells is suitable. Preferred alphaviruses include Old World alphaviruses, more preferably SF group alphaviruses and SIN group alphaviruses, more preferably Sindbis virus strains (e.g., TR339), S.A.AR86 virus, Girdwood S.A. virus, and Ockelbo virus. In a more preferred embodiment, the alphavirus contains one or more attenuating mutations, as described hereinabove.

Two types of recombinant virus vector are contemplated in carrying out the present invention. In one embodiment employing “double promoter vectors,” the heterologous RNA is inserted into a replication and propagation competent virus. Double promoter vectors are described in U.S. Pat. No. 5,505,947 to Johnston et al. With this type of viral vector, it is preferable that heterologous RNA sequences of less than 3 kilobases are inserted into the viral vector, more preferably those less than 2 kilobases, and more preferably still those less than 1 kilobase. In an alternate embodiment, propagation-defective “replicon vectors,” as described in copending U.S. application Ser. No. 08/448,630 to Johnston et al., will be used. One advantage of replicon viral vectors is that larger RNA inserts, up to approximately 4-5 kilobases in length, can be utilized. Double promoter vectors and replicon vectors are described in more detail hereinbelow.
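The insert-size guidance in the preceding paragraph can be illustrated with a minimal sketch. The function name and constants below are hypothetical and are not part of the disclosure; the limits simply restate the figures given in the text ("less than 3 kilobases" for double promoter vectors, "up to approximately 4-5 kilobases" for replicon vectors), and the replicon limit is taken, as an assumption, at the upper end of that approximate range.

```python
from typing import List

# Illustrative limits restating the text; both names are hypothetical.
DOUBLE_PROMOTER_MAX_KB = 3.0  # inserts of "less than 3 kilobases" preferred
REPLICON_MAX_KB = 5.0         # "up to approximately 4-5 kilobases" (assumed upper end)


def candidate_vectors(insert_kb: float) -> List[str]:
    """Return the vector types whose stated capacity accommodates the insert."""
    if insert_kb <= 0:
        raise ValueError("insert length must be positive")
    candidates = []
    if insert_kb < DOUBLE_PROMOTER_MAX_KB:
        candidates.append("double promoter")
    if insert_kb <= REPLICON_MAX_KB:
        candidates.append("replicon")
    return candidates
```

Under these assumptions a 0.8 kilobase insert fits either vector type, while a 4.2 kilobase insert fits only the replicon vector.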

The recombinant alphaviruses of the claimed method target the heterologous RNA to bone marrow cells, where it expresses the encoded protein or peptide. Heterologous RNA can be introduced and expressed in any cell type found in the bone marrow. Bone marrow cells that may be targeted by the recombinant alphaviruses of the present invention include, but are not limited to, polymorphonuclear cells, hemopoietic stem cells (including megakaryocyte colony forming units (CFU-M), spleen colony forming units (CFU-S), erythroid colony forming units (CFU-E), erythroid burst forming units (BFU-E), and colony forming units in culture (CFU-C)), erythrocytes, macrophages (including reticular cells), monocytes, granulocytes, megakaryocytes, lymphocytes, fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells, chondrocytes and other cells of synovial joints. Preferably, marrow cells within the endosteum are targeted, more preferably osteoblasts. Also preferred are methods in which cells in the endosteum of synovial joints (e.g., hip and knee joints) are targeted.

By targeting to the cells of the bone marrow, it is meant that the primary site in which the virus will be localized in vivo is the cells of the bone marrow. Alternately stated, the alphaviruses of the present invention target bone marrow cells, such that titers in bone marrow two days after infection are greater than 100 PFU/g crushed bone, preferably greater than 200 PFU/g crushed bone, more preferably greater than 300 PFU/g crushed bone, and more preferably still greater than 500 PFU/g crushed bone. Virus may be detected occasionally in other cell or tissue types, but only sporadically and usually at low levels. Virus localization in the bone marrow can be demonstrated by any suitable technique known in the art, such as in situ hybridization.
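The titer thresholds recited above can be expressed as a small classification sketch. The function name and the returned labels are hypothetical, introduced only for illustration; the numeric cutoffs (100, 200, 300, and 500 PFU/g crushed bone, measured two days after infection) are those stated in the text.

```python
def targeting_tier(pfu_per_g: float) -> str:
    """Map a day-2 bone marrow titer (PFU/g crushed bone) to the tier it meets."""
    if pfu_per_g < 0:
        raise ValueError("titer cannot be negative")
    # Each threshold is a strict "greater than" comparison, as in the text.
    if pfu_per_g > 500:
        return "more preferably still"
    if pfu_per_g > 300:
        return "more preferably"
    if pfu_per_g > 200:
        return "preferably"
    if pfu_per_g > 100:
        return "targets bone marrow"
    return "below threshold"
```

For example, a titer of 250 PFU/g would meet the "preferably" tier but not the higher ones.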

Bone marrow cells are long-lived and harbor infectious alphaviruses for a prolonged period of time, as demonstrated in the Examples below. These characteristics of bone marrow cells render the present invention useful not only for the purpose of supplying a desired protein or peptide to skeletal tissue, but also for expressing proteins or peptides in vivo that are needed by other cell or tissue types.

The present invention can be carried out in vivo or with cultured bone marrow cells in vitro. Bone marrow cell cultures include primary cultures of bone marrow cells, serially-passaged cultures of bone marrow cells, and cultures of immortalized bone marrow cell lines. Bone marrow cells may be cultured by any suitable means known in the art.

The recombinant alphaviruses of the present invention carry a heterologous RNA segment. The heterologous RNA segment encodes a promoter and an inserted heterologous RNA. The inserted heterologous RNA may encode any protein or peptide which is desirably expressed by the host bone marrow cells. Suitable heterologous RNA may be of prokaryotic (e.g., RNA encoding the Botulinus toxin C), or eukaryotic (e.g., RNA encoding malaria Plasmodium protein cs1) origin. Illustrative proteins and peptides encoded by the heterologous RNAs of the present invention include hormones, growth factors, interleukins, cytokines, chemokines, enzymes, and ribozymes. Alternately, the heterologous RNAs encode any therapeutic protein or peptide. As a further alternative, the heterologous RNAs of the present invention encode any immunogenic protein or peptide.

An immunogenic protein or peptide, or “immunogen,” may be any protein or peptide suitable for protecting the subject against a disease, including but not limited to microbial, bacterial, protozoal, parasitic, and viral diseases. For example, the immunogen may be an orthomyxovirus immunogen (e.g., an influenza virus immunogen, such as the influenza virus hemagglutinin (HA) surface protein or the influenza virus nucleoprotein gene, or an equine influenza virus immunogen), or a lentivirus immunogen (e.g., an equine infectious anemia virus immunogen, a Simian Immunodeficiency Virus (SIV) immunogen, or a Human Immunodeficiency Virus (HIV) immunogen, such as the HIV envelope GP160 protein and the HIV matrix/capsid proteins). The immunogen may also be an arenavirus immunogen (e.g., Lassa fever virus immunogen, such as the Lassa fever virus nucleocapsid protein gene and the Lassa fever envelope glycoprotein gene), a poxvirus immunogen (e.g., vaccinia), a flavivirus immunogen (e.g., a yellow fever virus immunogen or a Japanese encephalitis virus immunogen), a filovirus immunogen (e.g., an Ebola virus immunogen, or a Marburg virus immunogen), a bunyavirus immunogen (e.g., RVFV, CCHF, and SFS viruses), or a coronavirus immunogen (e.g., an infectious human coronavirus immunogen, such as the human coronavirus envelope glycoprotein gene, or a transmissible gastroenteritis virus immunogen for pigs, or an infectious bronchitis virus immunogen for chickens).

Alternatively, the present invention can be used to express heterologous RNAs encoding antisense oligonucleotides. In general, “antisense” refers to the use of small, synthetic oligonucleotides to inhibit gene expression by inhibiting the function of the target mRNA containing the complementary sequence. Milligan, J. F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). Gene expression is inhibited through hybridization to coding (sense) sequences in a specific mRNA target by hydrogen bonding according to Watson-Crick base pairing rules. The mechanism of antisense inhibition is that the exogenously applied oligonucleotides decrease the mRNA and protein levels of the target gene. Milligan, J. F. et al., J. Med. Chem. 36(14), 1923-1937 (1993). See also Helene, C. and Toulme, J., Biochim. Biophys. Acta 1049, 99-125 (1990); Cohen, J. S., Ed., OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press: Boca Raton, Fla. (1987).

Antisense oligonucleotides may be of any suitable length, depending on the particular target being bound. The only limit on the length of the antisense oligonucleotide is the capacity of the virus for inserted heterologous RNA. Antisense oligonucleotides may be complementary to the entire mRNA transcript of the target gene or only a portion thereof. Preferably the antisense oligonucleotide is directed to an mRNA region containing a junction between intron and exon. Where the antisense oligonucleotide is directed to an intron/exon junction, it may either entirely overlie the junction or may be sufficiently close to the junction to inhibit splicing out of the intervening exon during processing of precursor mRNA to mature mRNA (e.g., with the 3′ or 5′ terminus of the antisense oligonucleotide being positioned within about, for example, 10, 5, 3 or 2 nucleotides of the intron/exon junction). Also preferred are antisense oligonucleotides which overlap the initiation codon.

When practicing the present invention, the antisense oligonucleotides administered may be related in origin to the species to which they are administered. When treating humans, human antisense may be used if desired.
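Two points from the antisense discussion above, Watson-Crick complementarity to the target mRNA and placement relative to an intron/exon junction, can be sketched as follows. Both function names, the coordinate convention (0-based, half-open intervals on the target transcript), and the default window of 10 nucleotides are assumptions made for illustration; the text gives 10, 5, 3 or 2 nucleotides only as examples.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_mrna: str) -> str:
    """Return the antisense (reverse-complement) RNA for a target mRNA region."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target_mrna.upper()))


def is_positioned_at_junction(start: int, end: int, junction: int,
                              max_distance: int = 10) -> bool:
    """Check whether an oligonucleotide bound to target positions [start, end)
    overlies the junction, or has a terminus within max_distance nucleotides
    of it. `junction` is the index of the first base after the junction."""
    if start >= end:
        raise ValueError("start must be less than end")
    if start < junction < end:
        return True  # the oligonucleotide overlies the junction
    return min(abs(start - junction), abs(end - junction)) <= max_distance
```

For instance, `antisense_rna("AUGGCC")` yields `"GGCCAU"`, and an oligonucleotide bound to positions 25 through 44 with a junction at position 20 lies within a 5-nucleotide window of that junction.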

Promoters for use in carrying out the present invention are operable in bone marrow cells. An operable promoter in bone marrow cells is a promoter that is recognized by and functions in bone marrow cells. Promoters for use with the present invention must also be operatively associated with the heterologous RNA to be expressed in the bone marrow. A promoter is operably linked to a heterologous RNA if it controls the transcription of the heterologous RNA, where the heterologous RNA comprises a coding sequence. Suitable promoters are well known in the art. The Sindbis 26S promoter is preferred when the alphavirus is a strain of Sindbis virus. Additional preferred promoters beyond the Sindbis 26S promoter include the Girdwood S.A. 26S promoter when the alphavirus is Girdwood S.A., the S.A.AR86 26S promoter when the alphavirus is S.A.AR86, and any other promoter sequence recognized by alphavirus polymerases. Alphavirus promoter sequences containing mutations which alter the activity level of the promoter (in relation to the activity level of the wild-type) are also suitable in the practice of the present invention. Such mutant promoter sequences are described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the disclosure of which is incorporated in its entirety by reference.

The heterologous RNA is introduced into the bone marrow cells by contacting the recombinant alphavirus carrying the heterologous RNA segment to the bone marrow cells. By contacting, it is meant bringing the recombinant alphavirus and the bone marrow cells in physical proximity. The contacting step can be performed in vitro or in vivo. In vitro contacting can be carried out with cultures of immortalized or non-immortalized bone marrow cells. In one particular embodiment, bone marrow cells can be removed from a subject, cultured in vitro, infected with the vector, and then introduced back into the subject. Contacting is performed in vivo when the recombinant alphavirus is administered to a subject. Pharmaceutical formulations of recombinant alphavirus can be administered to a subject by parenteral (e.g., subcutaneous, intracerebral, intradermal, intramuscular, intravenous and intraarticular) administration. Alternatively, pharmaceutical formulations of the present invention may be suitable for administration to the mucous membranes of a subject (e.g., intranasal administration, by use of a dropper, swab, or inhaler). Methods of preparing infectious virus particles and pharmaceutical formulations thereof are discussed in more detail hereinbelow.

By “introducing” the heterologous RNA segment into the bone marrow cells it is meant infecting the bone marrow cells with recombinant alphavirus containing the heterologous RNA, such that the viral vector carrying the heterologous RNA enters the bone marrow cells and can be expressed therein. As used with respect to the present invention, when the heterologous RNA is “expressed,” it is meant that the heterologous RNA is transcribed. In particular embodiments of the invention in which it is desired to produce a protein or peptide, expression further includes the steps of post-transcriptional processing and translation of the mRNA transcribed from the heterologous RNA. In contrast, where the heterologous RNA encodes an antisense oligonucleotide, expression need not include post-transcriptional processing and translation. With respect to embodiments in which the heterologous RNA encodes an immunogenic protein or a protein being administered for therapeutic purposes, expression may also include the further step of post-translational processing to produce an immunogenic or therapeutically-active protein.

The present invention also provides infectious RNAs, as described hereinabove, and cDNAs encoding the same. Preferably the infectious RNAs and cDNAs are derived from the S.A.AR86, Girdwood S.A., TR339, or Ockelbo viruses. The cDNA clones can be generated by any of a variety of suitable methods known to those skilled in the art. A preferred method is the method set forth in U.S. Pat. No. 5,185,440 to Davis et al., the disclosure of which is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983).

RNA is preferably synthesized from the DNA sequence in vitro using purified RNA polymerase in the presence of ribonucleotide triphosphates and cap analogs in accordance with conventional techniques. However, the RNA may also be synthesized intracellularly after introduction of the cDNA.

A. Double Promoter Vectors.

In one embodiment of the invention, double promoter vectors are used to introduce the heterologous RNA into the target bone marrow cells. A double promoter virus vector is a replication and propagation competent virus. Double promoter vectors are described in U.S. Pat. No. 5,505,947 to Johnston et al., the disclosure of which is incorporated in its entirety by reference. Preferred alphaviruses for constructing the double promoter vectors are S.A.AR86, Girdwood S.A., TR339 and Ockelbo viruses. More preferably, the double promoter vector contains one or more attenuating mutations. Attenuating mutations are described in more detail hereinabove.

The double promoter vector is constructed so as to contain a second subgenomic promoter (i.e., 26S promoter) inserted 3′ to the virus RNA encoding the structural proteins. The heterologous RNA is inserted between the second subgenomic promoter, so as to be operatively associated therewith, and the 3′ UTR of the virus genome. Heterologous RNA sequences of less than 3 kilobases, more preferably those less than 2 kilobases, and more preferably still those less than 1 kilobase, can be inserted into the double promoter vector. In a preferred embodiment of the invention, the double promoter vector is derived from Girdwood S.A., and the second subgenomic promoter is a duplicate of the Girdwood S.A. subgenomic promoter. In an alternate preferred embodiment, the double promoter vector is derived from TR339, and the second subgenomic promoter is a duplicate of the TR339 subgenomic promoter.

B. Replicon Vectors.

Replicon vectors, which are propagation-defective virus vectors, can also be used to carry out the present invention. Replicon vectors are described in more detail in copending U.S. application Ser. No. 08/448,630 to Johnston et al., the disclosure of which is incorporated in its entirety by reference. Preferred alphaviruses for constructing the replicon vectors are S.A.AR86, Girdwood S.A., TR339, and Ockelbo.

In general, in the replicon system, a foreign gene to be expressed is inserted in place of at least one of the viral structural protein genes in a transcription plasmid containing an otherwise full-length cDNA copy of the alphavirus genome RNA. RNA transcribed from this plasmid contains an intact copy of the viral nonstructural genes which are responsible for RNA replication and transcription. Thus, if the transcribed RNA is transfected into susceptible cells, it will be replicated and translated to give the nonstructural proteins. These proteins will transcribe the transfected RNA to give high levels of subgenomic mRNA, which will then be translated to produce high levels of the foreign protein. The autonomously replicating RNA (i.e., replicon) can only be packaged into virus particles if the alphavirus structural protein genes are provided on one or more “helper” RNAs, which are cotransfected into cells along with the replicon RNA. The helper RNAs do not contain the viral nonstructural genes for replication, but these functions are provided in trans by the replicon RNA. Similarly, the transcriptase functions translated from the replicon RNA transcribe the structural protein genes on the helper RNA, resulting in the synthesis of viral structural proteins and packaging of the replicon into virus-like particles. As the packaging or encapsidation signal for alphavirus RNAs is located within the nonstructural genes, the absence of these sequences in the helper RNAs precludes their incorporation into virus particles.

Alphavirus-permissive cells employed in the methods of the present invention are cells which, upon transfection with the viral RNA transcript, are capable of producing viral particles. Preferred alphavirus-permissive cells are TR339-permissive cells, Girdwood S.A.-permissive cells, S.A.AR86-permissive cells, and Ockelbo-permissive cells. Alphaviruses have a broad host range. Examples of suitable host cells include, but are not limited to, Vero cells, baby hamster kidney (BHK) cells, and chicken embryo fibroblast cells.

The phrase “structural protein” as used herein refers to the encoded proteins which are required for encapsidation (e.g., packaging) of the RNA replicon, and include the capsid protein, E1 glycoprotein, and E2 glycoprotein. As described hereinabove, the structural proteins of the alphavirus are distributed among one or more helper RNAs (i.e., a first helper RNA and a second helper RNA). In addition, one or more structural proteins may be located on the same RNA molecule as the replicon RNA, provided that at least one structural protein is deleted from the replicon RNA such that the resulting alphavirus particle is propagation defective. As used herein, the terms “deleted” or “deletion” mean either total deletion of the specified segment or the deletion of a sufficient portion of the specified segment to render the segment inoperative or nonfunctional, in accordance with standard usage. See, e.g., U.S. Pat. No. 4,650,764 to Temin et al. The term “propagation defective” as used herein means that the replicon RNA cannot be encapsidated in the host cell in the absence of the helper RNA. The resulting alphavirus replicon particles are propagation defective inasmuch as the replicon RNA in these particles does not include all of the alphavirus structural proteins required for encapsidation, at least one of the required structural proteins being deleted therefrom, such that the replicon RNA initiates only an abortive infection; no new viral particles are produced, and there is no spread of the infection to other cells.

The helper cell for expressing the infectious, propagation defective alphavirus particle comprises a set of RNAs, as described above. The set of RNAs principally includes a first helper RNA and a second helper RNA. The first helper RNA includes RNA encoding at least one alphavirus structural protein but does not encode all alphavirus structural proteins. In other words, the first helper RNA does not encode at least one alphavirus structural protein, the at least one non-coded alphavirus structural protein being deleted from the first helper RNA. In one embodiment, the first helper RNA includes RNA encoding the alphavirus E1 glycoprotein, with the alphavirus capsid protein and the alphavirus E2 glycoprotein being deleted from the first helper RNA. In another embodiment, the first helper RNA includes RNA encoding the alphavirus E2 glycoprotein, with the alphavirus capsid protein and the alphavirus E1 glycoprotein being deleted from the first helper RNA. In a third, preferred embodiment, the first helper RNA includes RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, with the alphavirus capsid protein being deleted from the first helper RNA.

The second helper RNA includes RNA encoding at least one alphavirus structural protein which is different from the at least one structural protein encoded by the first helper RNA. Thus, the second helper RNA encodes at least one alphavirus structural protein which is not encoded by the first helper RNA. The second helper RNA does not encode the at least one alphavirus structural protein which is encoded by the first helper RNA; thus the first and second helper RNAs do not encode duplicate structural proteins. In the embodiment wherein the first helper RNA includes RNA encoding only the alphavirus E1 glycoprotein, the second helper RNA may include RNA encoding one or both of the alphavirus capsid protein and the alphavirus E2 glycoprotein which are deleted from the first helper RNA. In the embodiment wherein the first helper RNA includes RNA encoding only the alphavirus E2 glycoprotein, the second helper RNA may include RNA encoding one or both of the alphavirus capsid protein and the alphavirus E1 glycoprotein which are deleted from the first helper RNA. In the embodiment wherein the first helper RNA includes RNA encoding both the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, the second helper RNA may include RNA encoding the alphavirus capsid protein which is deleted from the first helper RNA.

In one embodiment, the packaging segment (RNA comprising the encapsidation or packaging signal) is deleted from at least the first helper RNA. In a preferred embodiment, the packaging segment is deleted from both the first helper RNA and the second helper RNA.

In the preferred embodiment wherein the packaging segment is deleted from both the first helper RNA and the second helper RNA, the helper cell is co-transfected with a replicon RNA in addition to the first helper RNA and the second helper RNA. The replicon RNA encodes the packaging segment and an inserted heterologous RNA. The inserted heterologous RNA may be RNA encoding a protein or a peptide. In a preferred embodiment, the replicon RNA, the first helper RNA and the second helper RNA are provided on separate molecules such that a first molecule, i.e., the replicon RNA, includes RNA encoding the packaging segment and the inserted heterologous RNA, a second molecule, i.e., the first helper RNA, includes RNA encoding at least one but not all of the required alphavirus structural proteins, and a third molecule, i.e., the second helper RNA, includes RNA encoding at least one but not all of the required alphavirus structural proteins. For example, in one preferred embodiment of the present invention, the helper cell includes a set of RNAs which include (a) a replicon RNA including RNA encoding an alphavirus packaging sequence and an inserted heterologous RNA, (b) a first helper RNA including RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein, and (c) a second helper RNA including RNA encoding the alphavirus capsid protein so that the alphavirus E1 glycoprotein, the alphavirus E2 glycoprotein and the capsid protein assemble together into alphavirus particles in the host cell.
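The rules for distributing the structural protein genes among the replicon RNA and the helper RNAs can be restated as a small consistency check. The following Python sketch is offered for illustration only; the function name `check_rna_set`, the gene labels, and the wording of the messages are hypothetical and form no part of the disclosed constructs. It assumes that the required structural proteins are the capsid protein, the E1 glycoprotein, and the E2 glycoprotein, as stated above.

```python
# Illustrative sketch only: names and messages are hypothetical.
# Required structural proteins, per the definition of "structural protein" above.
REQUIRED = frozenset({"capsid", "E1", "E2"})


def check_rna_set(first_helper, second_helper, replicon=()):
    """Return a list of problems with a proposed distribution of structural genes.

    Each argument is an iterable of gene labels carried by that RNA molecule.
    An empty list means the distribution satisfies the rules checked here.
    """
    first = frozenset(first_helper)
    second = frozenset(second_helper)
    rep = frozenset(replicon)
    problems = []

    unknown = (first | second | rep) - REQUIRED
    if unknown:
        problems.append(f"unrecognized structural genes: {sorted(unknown)}")
    # The first helper encodes at least one, but not all, structural proteins.
    if not first:
        problems.append("first helper encodes no structural protein")
    if REQUIRED <= first:
        problems.append("first helper encodes all structural proteins")
    # The two helpers do not encode duplicate structural proteins.
    if first & second:
        problems.append(f"helpers encode duplicate proteins: {sorted(first & second)}")
    # At least one structural gene must be deleted from the replicon RNA.
    if REQUIRED <= rep:
        problems.append("replicon retains every structural gene")
    # Every required protein must be supplied by some molecule for packaging.
    missing = REQUIRED - (first | second | rep)
    if missing:
        problems.append(f"no RNA encodes: {sorted(missing)}")
    return problems


if __name__ == "__main__":
    # Preferred three-molecule arrangement: E1 and E2 on one helper, capsid on the other.
    print(check_rna_set({"E1", "E2"}, {"capsid"}))
```

The optional `replicon` argument allows the same check to be applied where a structural gene is carried on the replicon molecule itself rather than on a second helper.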

In an alternate embodiment, the replicon RNA and the first helper RNA are on separate molecules, and the replicon RNA and RNA encoding a structural gene not encoded by the first helper RNA are on another single molecule together, such that a first molecule, i.e., the first helper RNA, includes RNA encoding at least one but not all of the required alphavirus structural proteins, and a second molecule, i.e., the replicon RNA, includes RNA encoding the packaging segment, the inserted heterologous RNA, and the remaining structural proteins not encoded by the first helper RNA. For example, in one preferred embodiment of the present invention, the helper cell includes a set of RNAs including (a) a replicon RNA including RNA encoding an alphavirus packaging sequence, an inserted heterologous RNA, and an alphavirus capsid protein, and (b) a first helper RNA including RNA encoding the alphavirus E1 glycoprotein and the alphavirus E2 glycoprotein so that the alphavirus E1 glycoprotein, the alphavirus E2 glycoprotein and the capsid protein assemble together into alphavirus particles in the host cell, with the replicon RNA packaged therein.

In one preferred embodiment of the present invention, the RNA encoding the alphavirus structural proteins, i.e., the capsid, E1 glycoprotein and E2 glycoprotein, contains at least one attenuating mutation, as described hereinabove. Thus, according to this embodiment, at least one of the first helper RNA and the second helper RNA includes at least one attenuating mutation. In a more preferred embodiment, at least one of the first helper RNA and the second helper RNA includes at least two, or multiple, attenuating mutations. The multiple attenuating mutations may be positioned in either the first helper RNA or in the second helper RNA, or they may be distributed randomly with one or more attenuating mutations being positioned in the first helper RNA and one or more attenuating mutations positioned in the second helper RNA. Alternatively, when the replicon RNA and the RNA encoding the structural proteins not encoded by the first helper RNA are located on the same molecule, an attenuating mutation may be positioned in the RNA which codes for the structural protein not encoded by the first helper RNA. The attenuating mutations may also be located within the RNA encoding non-structural proteins (e.g., the replicon RNA).

Preferably, the first helper RNA and the second helper RNA also include a promoter. It is also preferred that the replicon RNA also includes a promoter. Suitable promoters for inclusion in the first helper RNA, second helper RNA and replicon RNA are well known in the art. One preferred promoter is the Girdwood S.A. 26S promoter for use when the alphavirus is Girdwood S.A. Another preferred promoter is the TR339 26S promoter for use when the alphavirus is TR339. Additional promoters beyond the Girdwood S.A. and TR339 promoters include the VEE 26S promoter, the Sindbis 26S promoter, the Semliki Forest 26S promoter, and any other promoter sequence recognized by alphavirus polymerases. Alphavirus promoter sequences containing mutations which alter the activity level of the promoter (in relation to the activity level of the wild-type) are also suitable in the practice of the present invention. Such mutant promoter sequences are described in Raju and Huang, J. Virol. 65, 2501-2510 (1991), the disclosure of which is incorporated herein in its entirety. In the system wherein the first helper RNA, the second helper RNA, and the replicon RNA are all on separate molecules, the promoters, if the same promoter is used for all three RNAs, provide a homologous sequence between the three molecules. It is preferred that the selected promoter is operative with the non-structural proteins encoded by the replicon RNA molecule.

In cases where vaccination with two immunogens provides improved protection against disease as compared to vaccination with only a single immunogen, a double-promoter replicon would ensure that both immunogens are produced in the same cell. Such a replicon would be the same as the one described above, except that it would contain two copies of the 26S RNA promoter, each followed by a different multiple cloning site, to allow for the insertion and expression of two different heterologous proteins. Another useful strategy is to insert the IRES sequence from the picornavirus, EMC virus, between the two heterologous genes downstream from the single 26S promoter of the replicon described above, thus leading to expression of two immunogens from the single replicon transcript in the same cell.

C. Uses of the Present Invention.

The alphavirus vectors, RNAs, cDNAs, helper cells, infectious virus particles, and methods of the present invention find use in in vitro expression systems, wherein the inserted heterologous RNA encodes a protein or peptide which is desirably produced in vitro. The RNAs, cDNAs, helper cells, infectious virus particles, methods, and pharmaceutical formulations of the present invention are additionally useful in a method of administering a protein or peptide to a subject in need of the protein or peptide, as a method of treatment or otherwise. In this embodiment of the invention, the heterologous RNA encodes the desired protein or peptide, and pharmaceutical formulations of the present invention are administered to a subject in need of the desired protein or peptide. In this manner, the protein or peptide may thus be produced in vivo in the subject. The subject may be in need of the protein or peptide because the subject has a deficiency thereof, or because the production of the protein or peptide in the subject may impart some therapeutic effect, as a method of treatment or otherwise.

Alternately, the claimed methods provide a vaccination strategy, wherein the heterologous RNA encodes an immunogenic protein or peptide.

The methods and products of the invention are also useful as antigens and for evoking the production of antibodies in animals such as horses and rabbits, from which the antibodies may be collected and then used in diagnostic assays in accordance with known techniques.

A further aspect of the present invention is a method of introducing and expressing antisense oligonucleotides in bone marrow cell cultures to regulate gene expression. Alternately, the claimed method finds use in introducing and expressing a protein or peptide in bone marrow cell cultures.

II. Girdwood S.A. and TR339 Clones

Disclosed hereinbelow are genomic RNA sequences encoding live Girdwood S.A. virus, live S.A.AR86 virus, and live Sindbis strain TR339 virus, cDNAs derived therefrom, infectious RNA transcripts encoded by the cDNAs, infectious viral particles containing the infectious RNA transcripts, and pharmaceutical formulations derived therefrom.

The cDNA sequence of Girdwood S.A. is given herein as SEQ ID NO:4. Alternatively, the cDNA may have a sequence which differs from the cDNA of SEQ ID NO:4, but which has the same protein sequence as the cDNA given herein as SEQ ID NO:4. Thus, the cDNA may include one or more silent mutations.

The phrase “silent mutation” as used herein refers to mutations in the cDNA coding sequence which do not produce mutations in the corresponding protein sequence translated therefrom.

Likewise, the cDNA sequence of TR339 is given herein as SEQ ID NO:8. Alternatively, the cDNA may have a sequence which differs from the cDNA of SEQ ID NO:8, but which has the same protein sequence as the cDNA given herein as SEQ ID NO:8. Thus, the cDNA may include one or more silent mutations.

The cDNAs encoding infectious Girdwood S.A. and TR339 virus RNA transcripts of the present invention include those homologous to, and having essentially the same biological properties as, the cDNA sequences disclosed herein as SEQ ID NO:4 and SEQ ID NO:8, respectively. Thus, cDNAs that hybridize to cDNAs encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed herein are also an aspect of this invention. Conditions which will permit other cDNAs encoding infectious Girdwood S.A. or TR339 virus transcripts to hybridize to the cDNAs disclosed herein can be determined in accordance with known techniques. For example, hybridization of such sequences may be carried out under conditions of reduced stringency, medium stringency, or even high stringency conditions (e.g., conditions represented by a wash stringency of 35-40% formamide with 5× Denhardt's solution, 0.5% SDS and 1× SSPE at 37° C.; conditions represented by a wash stringency of 40-45% formamide with 5× Denhardt's solution, 0.5% SDS, and 1× SSPE at 42° C.; and conditions represented by a wash stringency of 50% formamide with 5× Denhardt's solution, 0.5% SDS and 1× SSPE at 42° C., respectively, to cDNA encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed herein in a standard hybridization assay. See J. SAMBROOK ET AL., MOLECULAR CLONING: A LABORATORY MANUAL (2d ed. 1989)). In general, cDNA sequences encoding infectious Girdwood S.A. or TR339 virus RNA transcripts that hybridize to the cDNAs disclosed herein will be at least 30% homologous, 50% homologous, 75% homologous, and even 95% homologous or more with the cDNA sequences encoding infectious Girdwood S.A. or TR339 virus RNA transcripts disclosed herein.

Promoter sequences and Girdwood S.A. virus or Sindbis virus strain TR339 cDNA clones are operatively associated in the present invention such that the promoter causes the cDNA clone to be transcribed in the presence of an RNA polymerase which binds to the promoter. The promoter is positioned on the 5′ end (with respect to the virion RNA sequence) of the cDNA clone. An excessive number of nucleotides between the promoter sequence and the cDNA clone will result in the inoperability of the construct. Hence, the number of nucleotides between the promoter sequence and the cDNA clone is preferably not more than eight, more preferably not more than five, still more preferably not more than three, and most preferably not more than one.

Examples of promoters which are useful in the cDNA sequences of the present invention include, but are not limited to, T3 promoters, T7 promoters, cytomegalovirus (CMV) promoters, and SP6 promoters. The DNA sequence of the present invention may reside in any suitable transcription vector. The DNA sequence preferably has a complementary DNA sequence bound thereto so that the double-stranded sequence will serve as an active template for RNA polymerase. The transcription vector preferably comprises a plasmid. When the DNA sequence comprises a plasmid, it is preferred that a unique restriction site be provided 3′ (with respect to the virion RNA sequence) to the cDNA clone. This provides a means for linearizing the DNA sequence to allow the transcription of genome-length RNA in vitro.

The cDNA clones can be generated by any of a variety of suitable methods known to those skilled in the art. A preferred method is the method set forth in U.S. Pat. No. 5,185,440 to Davis et al., the disclosure of which is incorporated in its entirety by reference, and Gubler et al., Gene 25:263 (1983).

RNA is preferably synthesized from the DNA sequence in vitro using purified RNA polymerase in the presence of ribonucleotide triphosphates and cap analogs in accordance with conventional techniques. However, the RNA may also be synthesized intracellularly after introduction of the cDNA.

The Girdwood S.A. and TR339 cDNA clones and the infectious RNAs and infectious virus particles produced therefrom of the present invention are useful for the preparation of pharmaceutical formulations, such as vaccines. In addition, the cDNA clones, infectious RNAs, and infectious viral particles of the present invention are useful for administration to animals for the purpose of producing antibodies to the Girdwood S.A. virus or the Sindbis virus strain TR339, which antibodies may be collected and used in known diagnostic techniques for the detection of Girdwood S.A. virus or Sindbis virus strain TR339. Antibodies can also be generated to the viral proteins expressed from the cDNAs disclosed herein. As another aspect of the present invention, the claimed cDNA clones are useful as nucleotide probes to detect the presence of Girdwood S.A. or TR339 genomic RNA or transcripts.

III. Infectious Virus Particles and Pharmaceutical Formulations

The infectious virus particles of the present invention include those containing double promoter vectors and those containing replicon vectors as described hereinabove. Alternately, the infectious virus particles contain infectious RNAs encoding the Girdwood S.A. or TR339 genome. When the infectious RNA comprises the Girdwood S.A. genome, preferably the RNA has the sequence encoded by the cDNA given as SEQ ID NO:4. When the infectious RNA comprises the TR339 genome, preferably the RNA has the sequence encoded by the cDNA given as SEQ ID NO:8.

The infectious alphavirus particles of the present invention may be prepared according to the methods disclosed herein in combination with techniques known to those skilled in the art. These methods include transfecting an alphavirus-permissive cell with a replicon RNA including the alphavirus packaging segment and an inserted heterologous RNA, a first helper RNA including RNA encoding at least one alphavirus structural protein, and a second helper RNA including RNA encoding at least one alphavirus structural protein which is different from that encoded by the first helper RNA. Alternately, and preferably, at least one of the helper RNAs is produced from a cDNA encoding the helper RNA and operably associated with an appropriate promoter, the cDNA being stably transfected and integrated into the cells. More preferably, all of the helper RNAs will be “launched” from stably transfected cDNAs. The step of transfecting the alphavirus-permissive cell can be carried out according to any suitable means known to those skilled in the art, as described above with respect to propagation-competent viruses.

Uptake of propagation-competent RNA into the cells in vitro can be carried out according to any suitable means known to those skilled in the art. Uptake of RNA into the cells can be achieved, for example, by treating the cells with DEAE-dextran, treating the RNA with LIPOFECTIN® before addition to the cells, or by electroporation, with electroporation being the currently preferred means. These techniques are well known in the art. See e.g., U.S. Pat. No. 5,185,440 to Davis et al., and PCT Publication No. WO 92/10578 to Bioption AB, the disclosures of which are incorporated herein by reference in their entirety. Uptake of propagation-competent RNA into the cell in vivo can be carried out by administering the infectious RNA to a subject as described in Section I above.

The infectious RNAs may also contain a heterologous RNA segment, where the heterologous RNA segment contains a heterologous RNA and a promoter operably associated therewith. It is preferred that the infectious RNA introduces and expresses the heterologous RNA in bone marrow cells as described in Section I above. According to this embodiment, it is preferable that the promoter operatively associated with the heterologous RNA is operable in bone marrow cells. The heterologous RNA may encode any protein or peptide, preferably an immunogenic protein or peptide, a therapeutic protein or peptide, a hormone, a growth factor, an interleukin, a cytokine, a chemokine, an enzyme, a ribozyme, or an antisense oligonucleotide as described in more detail in Section I above.

The step of facilitating the production of the infectious viral particles in the cells may be carried out using conventional techniques. See e.g., U.S. Pat. No. 5,185,440 to Davis et al., PCT Publication No. WO 92/10578 to Bioption AB, and U.S. Pat. No. 4,650,764 to Temin et al. (although Temin et al. relates to retroviruses rather than alphaviruses). The infectious viral particles may be produced by standard cell culture growth techniques.

The step of collecting the infectious virus particles may also be carried out using conventional techniques. For example, the infectious particles may be collected by cell lysis, or collection of the supernatant of the cell culture, as is known in the art. See e.g., U.S. Pat. No. 5,185,440 to Davis et al., PCT Publication No. WO 92/10578 to Bioption AB, and U.S. Pat. No. 4,650,764 to Temin et al. Other suitable techniques will be known to those skilled in the art. Optionally, the collected infectious virus particles may be purified if desired. Suitable purification techniques are well known to those skilled in the art.

Pharmaceutical formulations, such as vaccines, of the present invention comprise an immunogenic amount of the infectious virus particles in combination with a pharmaceutically acceptable carrier. An “immunogenic amount” is an amount of the infectious virus particles which is sufficient to evoke an immune response in the subject to which the pharmaceutical formulation is administered. An amount of from about 10³ to about 10⁷ particles, and preferably about 10⁴ to 10⁶ particles per dose is believed suitable, depending upon the age and species of the subject being treated, and the immunogen against which the immune response is desired.

Pharmaceutical formulations of the present invention for therapeutic use comprise a therapeutic amount of the infectious virus particles in combination with a pharmaceutically acceptable carrier. A “therapeutic amount” is an amount of the infectious virus particles which is sufficient to produce a therapeutic effect (e.g., triggering an immune response or supplying a protein to a subject in need thereof) in the subject to which the pharmaceutical formulation is administered. The therapeutic amount will depend upon the age and species of the subject being treated, and the therapeutic protein or peptide being administered. Typical dosages are an amount from about 10¹ to about 10⁵ infectious units.

Exemplary pharmaceutically acceptable carriers include, but are not limited to, sterile pyrogen-free water and sterile pyrogen-free physiological saline solution. Subjects which may be administered immunogenic amounts of the infectious virus particles of the present invention include but are not limited to human and animal (e.g., pig, cattle, dog, horse, donkey, mouse, hamster, monkey) subjects.

Pharmaceutical formulations of the present invention include those suitable for parenteral (e.g., subcutaneous, intracerebral, intradermal, intramuscular, intravenous and intraarticular) administration. Alternatively, pharmaceutical formulations of the present invention may be suitable for administration to the mucous membranes of a subject (e.g., intranasal administration by use of a dropper, swab, or inhaler). The formulations may be conveniently prepared in unit dosage form and may be prepared by any of the methods well known in the art.

The following examples are provided to illustrate the present invention, and should not be construed as limiting thereof. In these examples, PBS means phosphate buffered saline, EDTA means ethylene diamine tetraacetate, ml means milliliter, μl means microliter, mM means millimolar, μM means micromolar, u means unit, PFU means plaque forming units, g means gram, mg means milligram, μg means microgram, cpm means counts per minute, ic means intracerebral or intracerebrally, ip means intraperitoneal or intraperitoneally, iv means intravenous or intravenously, and sc means subcutaneous or subcutaneously.

Amino acid sequences disclosed herein are presented in the amino to carboxyl direction, from left to right. The amino and carboxyl groups are not presented in the sequence. Nucleotide sequences are presented herein by single strand only in the 5′ to 3′ direction, from left to right. Nucleotides and amino acids are represented herein in the manner recommended by the IUPAC-IUB Biochemical Nomenclature Commission, or (for amino acids) by either one letter or three letter code, in accordance with 37 CFR §1.822 and established usage. Where one letter amino acid code is used, the same sequence is also presented elsewhere in three letter code.

EXAMPLE 1 Cells and Virus Stocks

S.A.AR86 was isolated in 1954 from a pool of Culex sp. mosquitoes collected near Johannesburg, South Africa. Weinbren et al., S. Afr. Med. J. 30, 631-36 (1956). Ockelbo82 was isolated from Culiseta sp. mosquitoes collected in Edsbyn, Sweden in 1982 and was associated serologically with human disease. Niklasson et al., Am. J. Trop. Med. Hyg. 33, 1212-17 (1984). Girdwood S.A. was isolated from a human patient in the Johannesburg area of South Africa in 1963. Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963). Molecularly cloned virus TR339 represents the deduced consensus sequence of Sindbis AR339. McKnight et al., J. Virol. 70, 1981-89 (1996); William Klimstra, personal communication. TRSB is a laboratory strain of Sindbis isolate AR339 derived from a cDNA clone pTRSB and differing from the AR339 consensus sequence at three codons. McKnight et al., J. Virol. 70, 1981-89 (1996). pTR5000 is a full-length cDNA clone of Sindbis AR339 following the SP6 phage promoter and containing mostly Sindbis AR339 sequences.

Stocks of all molecularly cloned viruses were prepared by electroporating genome length in vitro transcripts of their respective cDNA clones in BHK-21 cells. Heidner et al., J. Virol. 68, 2683-92 (1994). Girdwood S.A. (Malherbe et al., S. Afr. Med. J. 37, 547-52 (1963)) and Ockelbo82 (Espmark and Niklasson, Am. J. Trop. Med. Hyg. 33, 1203-11 (1984); Niklasson et al., Am. J. Trop. Med. Hyg. 33, 1212-17 (1984)) were passed one to three times in BHK-21 cells in order to produce amplified stocks of virus. All virus stocks were stored at −70° C. until needed. The titers of the virus stocks were determined on BHK-21 cells from aliquots of frozen virus.

EXAMPLE 2 Cloning the S.A.AR86 and Girdwood S.A. Genomic Sequences

The sequences of S.A.AR86 (SEQ ID NO:1) and Girdwood S.A. (SEQ ID NO:4) were determined from uncloned reverse transcriptase-polymerase chain reaction (RT-PCR) fragments amplified from virion RNA. Heidner et al., J. Virol. 68, 2683-92 (1994). The sequence of the 5′ 40 nucleotides was determined by directly sequencing the genomic RNA. Sanger et al., Proc. Natl. Acad. Sci. USA 74, 5463-67 (1977); Zimmern and Kaesberg, Proc. Natl. Acad. Sci. USA 75, 4257-61 (1978); Ahlquist et al., Cell 23, 183-89 (1981).

The S.A.AR86 genome was 11,663 nucleotides in length, excluding the 5′ CAP and 3′ poly(A) tail, 40 nucleotides shorter than the alphavirus prototype Sindbis strain AR339. Strauss et al., Virology 133, 92-110 (1984). Compared with the consensus sequence of Sindbis virus AR339 (McKnight et al., J. Virol. 70, 1981-89 (1996)), S.A.AR86 contained two separate 6-nucleotide insertions, and one 3-nucleotide insertion in the 3′ half of the nsP3 gene, a region not well conserved among alphaviruses. The two 6-nucleotide insertions were found immediately 3′ of nucleotides 5403 and 5450, and the 3-nucleotide insertion was immediately 3′ of nucleotide 5546 compared with the AR339 genome. In addition, S.A.AR86 contained a 54-nucleotide deletion in nsP3 which spanned nucleotides 5256 to 5311 of AR339. As a result of these deletions and insertions, S.A.AR86 nsP3 was 13 amino acids smaller than AR339, containing an 18-amino acid deletion and a total of 5 amino acids inserted. The 3′ untranslated region of S.A.AR86 contained, with respect to AR339, two 1-nucleotide deletions at nucleotides 11,513 and 11,602, and one 1-nucleotide insertion following nucleotide 11,664. The total numbers of nucleotides and predicted amino acids comprising the remaining genes of S.A.AR86 were identical to those of AR339.
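The insertion and deletion sizes reported above can be tallied to confirm that they account for the stated 40-nucleotide and 13-amino acid differences. The following Python sketch is illustrative only; the variable names are hypothetical, and the arithmetic uses only the sizes stated in the preceding paragraph (not the nucleotide coordinates of the deletion).

```python
# Illustrative tally; all figures are taken from the preceding paragraph.
SAAR86_LENGTH_NT = 11_663          # excluding 5' cap and 3' poly(A) tail

nsp3_insertions_nt = [6, 6, 3]     # two 6-nt insertions and one 3-nt insertion
nsp3_deletion_nt = 54              # single deletion in nsP3
utr3_deletions_nt = [1, 1]         # two 1-nt deletions in the 3' UTR
utr3_insertions_nt = [1]           # one 1-nt insertion in the 3' UTR

# Net nucleotide change relative to AR339 in each region.
nsp3_net_nt = sum(nsp3_insertions_nt) - nsp3_deletion_nt
utr3_net_nt = sum(utr3_insertions_nt) - sum(utr3_deletions_nt)
total_net_nt = nsp3_net_nt + utr3_net_nt

# The same nsP3 changes expressed in codons (3 nucleotides per amino acid).
inserted_aa = sum(nsp3_insertions_nt) // 3
deleted_aa = nsp3_deletion_nt // 3
nsp3_net_aa = inserted_aa - deleted_aa

# AR339 length implied by the S.A.AR86 length and the net change.
implied_ar339_length_nt = SAAR86_LENGTH_NT - total_net_nt

print("nsP3 net change (nt):", nsp3_net_nt)
print("3' UTR net change (nt):", utr3_net_nt)
print("total net change (nt):", total_net_nt)
print("nsP3 net change (aa):", nsp3_net_aa)
print("implied AR339 length (nt):", implied_ar339_length_nt)
```

Under these figures the nsP3 and 3′ untranslated region changes together sum to the 40-nucleotide difference, and the nsP3 changes alone give the 13-amino acid difference.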

The cDNA sequence of S.A.AR86 is presented in SEQ ID NO:1. Nucleotides 1 through 59 represent the 5′ UTR, the non-structural polyprotein is encoded by nucleotides 60 through 7559 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099; nsP3--nt4100 through nt5729; nsP4--nt5730 through nt7559), the structural polyprotein is encoded by nucleotides 7608 through 11342 (capsid--nt7608 through nt8399; E3--nt8400 through nt8591; E2--nt8592 through nt9860; 6K--nt9861 through nt10025; E1--nt10026 through nt11342), and the 3′ UTR is represented by nucleotides 11346 through 11663.
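The coordinates listed above can be represented as a simple feature table from which segment lengths and any unassigned intervals are computed. The following Python sketch is illustrative only; the names `SAAR86_FEATURES`, `feature_lengths`, and `unassigned_intervals` are hypothetical, and the coordinates are copied from the preceding paragraph (1-based, inclusive).

```python
# Illustrative feature table; coordinates are 1-based and inclusive,
# copied from the preceding paragraph for SEQ ID NO:1.
SAAR86_FEATURES = [
    ("5' UTR", 1, 59),
    ("nsP1", 60, 1679),
    ("nsP2", 1680, 4099),
    ("nsP3", 4100, 5729),
    ("nsP4", 5730, 7559),
    ("capsid", 7608, 8399),
    ("E3", 8400, 8591),
    ("E2", 8592, 9860),
    ("6K", 9861, 10025),
    ("E1", 10026, 11342),
    ("3' UTR", 11346, 11663),
]


def feature_lengths(features):
    """Map each feature name to its length in nucleotides."""
    return {name: end - start + 1 for name, start, end in features}


def unassigned_intervals(features):
    """List (previous feature, next feature, size) for nucleotides lying
    between consecutive features; overlaps would appear as negative sizes."""
    result = []
    for (prev_name, _, prev_end), (next_name, next_start, _) in zip(features, features[1:]):
        size = next_start - prev_end - 1
        if size != 0:
            result.append((prev_name, next_name, size))
    return result


if __name__ == "__main__":
    for name, length in feature_lengths(SAAR86_FEATURES).items():
        print(f"{name}: {length} nt")
    for prev_name, next_name, size in unassigned_intervals(SAAR86_FEATURES):
        print(f"{size} nt between {prev_name} and {next_name}")
```

With the coordinates as listed, the only nucleotides not assigned to a named feature lie between nsP4 and the capsid gene and between E1 and the 3′ UTR; the sketch reports their sizes without interpreting them.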

A notable feature of the deduced amino acid sequence of S.A.AR86 (SEQ ID NO:2 and SEQ ID NO:3) was the cysteine codon in place of an opal termination codon between nsP3 and nsP4. S.A.AR86 is the only alphavirus of the Sindbis group, and one of just three alphavirus isolates sequenced to date, which do not contain an opal termination codon between nsP3 and nsP4. Takkinen, K., Nucleic Acids Res. 14, 5667-5682 (1986); Strauss et al., Virology 164, 265-74 (1988).

The genome of Girdwood S.A. was 11,717 nucleotides long excluding the 5′ CAP and 3′ poly(A) tail. The nucleotide sequence (SEQ ID NO:4) of the Girdwood S.A. genome and the putative amino acid sequence (SEQ ID NO:5 and SEQ ID NO:6) of the Girdwood S.A. gene products are shown in the accompanying sequence listings. Position 1902 in SEQ ID NO:5 indicates the position of the opal termination codon in the coding region of the nonstructural polyprotein. The extra nucleotides relative to AR339 were in the nonconserved half of nsP3, which contained insertions totalling 15 nucleotides, and in the 3′ untranslated region which contained two 1-nucleotide deletions and a 1-nucleotide insertion with respect to the consensus Sindbis AR339 genome. The insertions found in the nsP3 gene of Girdwood S.A. were identical in position and content to those found in S.A.AR86, although Girdwood S.A. did not have the large nsP3 deletion characteristic of S.A.AR86. The remaining portions of the genome contained the same number of nucleotides and predicted amino acids as Sindbis AR339.

The cDNA sequence of Girdwood S.A. is presented in SEQ ID NO:4. An “N” in the sequence indicates that the identity of the nucleotide at that position is unknown. Nucleotides 1 through 59 represent the 5′ UTR, the non-structural polyprotein is encoded by nucleotides 60 through 7613 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099; nsP3--nt4100 through nt5762 or nt5783; nsP4--nt5784 through nt7613), the structural polyprotein is encoded by nucleotides 7662 through 11396 (capsid--nt7662 through nt8453; E3--nt8454 through nt8645; E2--nt8646 through nt9914; 6K--nt9915 through nt10079; E1--nt10080 through nt11396), and the 3′ UTR is represented by nucleotides 11400 through 11717. There is an opal termination codon at nucleotides 5763 through 5765.

Overall, Girdwood S.A. was 94.5% identical to the consensus Sindbis AR339 sequence, differing at 655 nucleotides not including the insertions and deletions. These nucleotide differences resulted in 88 predicted amino acid changes or a difference of 2.3%. A plurality of amino acid differences were concentrated in the nsP3 gene, which contained 32 of the amino acid changes, 25 of which were in the nonconserved 3′ half.

The Girdwood S.A. nucleotides at positions 1, 3, and 11,717 could not be resolved. Because the primer used during the RT-PCR amplification of the 3′ end of the genome assumed a cytosine in the 3′ terminal position, the identity of this nucleotide could not be determined with certainty. However, in all alphaviruses sequenced to date there is a cytosine in this position. This, combined with the fact that no difficulty was encountered in obtaining RT-PCR product for this region with an oligo(dT) primer ending with a 3′ G, suggested that Girdwood S.A. also contains a cytosine at this position. The ambiguity at nucleotide positions 1 and 3 resulted from strong stops encountered during the RNA sequencing.

EXAMPLE 3 Comparison of S.A.AR86 and Girdwood S.A. Sequences With Other Sindbis-Related Virus Sequences

Table 1 examines the relationship of S.A.AR86 and Girdwood S.A. to each other and to other Sindbis-related viruses. This was accomplished by aligning the nucleotide and deduced amino acid sequences of Ockelbo82, AR339 and Girdwood S.A. to those of S.A.AR86 and then calculating the percentage identity for each gene using the programs contained within the Wisconsin-GCG package (Genetics Computer Group, 575 Science Drive, Madison, Wis. 53711), as described in more detail in McKnight et al., J. Virol. 70, 1981-89 (1996).
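The per-gene comparison described above amounts to counting mismatched columns in a pairwise alignment. The Python sketch below illustrates that calculation; it does not reproduce the Wisconsin-GCG programs. The function name percent_identity and the "-" gap character are assumptions, and gap columns are counted as differences because the table states that differences include insertions and deletions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (differences, percent identity) for two aligned sequences.

    Both inputs must already be aligned to the same length. A column where
    only one sequence has a gap counts as a difference; a column where both
    sequences have a gap is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable columns")
    differences = sum(1 for a, b in columns if a != b)
    identity = 100.0 * (len(columns) - differences) / len(columns)
    return differences, identity


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not taken from the
    # sequence listing.
    print(percent_identity("ATG-AG", "ATGCAA"))
```

Applied gene by gene to real alignments, a function of this kind would yield difference counts and percentages in the same form as the table entries, although the exact values depend on the alignment parameters used.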

The analysis suggests that S.A.AR86 is most similar to the other South African isolate, Girdwood S.A., and that the South African isolates are more similar to the Swedish Ockelbo82 isolate than to the Egyptian Sindbis AR339 isolate. These results also suggest that it is unlikely that S.A.AR86 is a recombinant virus like WEE virus. Hahn et al., Proc. Natl. Acad. Sci. USA 85, 5997-6001 (1988).

TABLE 1
Comparison of the Nucleotide and Amino Acid Sequences of S.A.AR86 Virus with Those of Sindbis AR339, Ockelbo82, and Girdwood S.A. Viruses^(a)

                     Nucleotide Differences^(b)            Amino Acid Differences^(b)
                     Number (%)                            Number (%)
Regions              AR339        OCK82        GIRD        AR339        OCK82       GIRD
5′ untranslated      0 (0.0)      0 (0.0)      1 (1.7)     —            —           —
nsP1                 76 (4.7)     37 (2.3)     15 (0.9)    9 (1.7)      6 (1.1)     2 (0.4)
nsP2                 137 (5.7)    86 (3.6)     45 (1.9)    15 (1.9)     8 (1.0)     12 (1.5)
nsP3
  Conserved^(c)      51 (5.7)     35 (3.9)     13 (1.6)    6 (2.0)      1 (0.3)     1 (0.4)
  Nonconserved^(d)   116 (6.6)    83 (4.4)     70 (2.2)    45 (9.7)     34 (7.0)    27 (3.7)
nsP4                 111 (6.1)    68 (3.7)     19 (1.1)    8 (1.3)      2 (0.3)     4 (0.6)
26s junction         1 (2.1)      0 (0.0)      1 (2.1)     —            —           —
Capsid               36 (4.5)     26 (3.3)     7 (0.9)     1 (0.4)      3 (1.1)     0 (0.0)
E3                   17 (8.9)     5 (2.6)      4 (2.1)     1 (1.6)      0 (0.0)     0 (0.0)
E2                   71 (5.6)     43 (3.4)     18 (1.4)    12 (2.6)     6 (1.4)     2 (0.5)
6K                   10 (6.1)     9 (5.4)      4 (2.4)     2 (3.6)      2 (3.6)     1 (1.8)
E1                   49 (3.7)     31 (2.3)     16 (1.2)    7 (1.6)      6 (1.4)     2 (0.9)
3′ untranslated      14 (4.5)     8 (2.5)      1 (0.3)     —            —           —
Totals               689 (5.5)    431 (3.3)    214 (1.4)   106 (2.3)    68 (1.4)    51 (0.9)

^(a)All nucleotide positions and gene boundaries are numbered according to those used for the Sindbis AR339, HR_(sp) variant GenBank Accession No. J02363; Strauss et al., Virology 133, 92-110 (1984).
^(b)Differences include insertions and deletions.
^(c)Conserved region nucleotides 4100 to 5000 (aa 1 to aa300).
^(d)Nonconserved region nucleotides 5001 to 5729 (aa301 to aa5421 S.A.AR86 numbering).

EXAMPLE 4 Neurovirulence of S.A.AR86 and Girdwood S.A.

Girdwood S.A., Ockelbo82, and S.A.AR86 are related by sequence; in contrast, it has previously been reported that only S.A.AR86 displayed the adult mouse neurovirulence phenotype. Russell et al., J. Virol. 63, 1619-29 (1989). These findings were confirmed by the present investigations. Briefly, groups of four female CD-1 mice (3-6 weeks of age) were inoculated ic with 10³ plaque-forming units (PFU) of S.A.AR86, Girdwood S.A., or Ockelbo82. Neither Girdwood S.A. nor Ockelbo82 infection produced any clinical signs of infection. Infection with S.A.AR86 produced neurological signs within four to five days and ultimately killed 100% of the mice as previously demonstrated.

Table 2 lists those amino acids of S.A.AR86 which might explain the neurovirulence phenotype in adult mice. A position was scored as potentially related to the S.A.AR86 adult neurovirulence phenotype if the S.A.AR86 amino acid differed from that which otherwise was absolutely conserved at that position in the other viruses.
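The scoring rule stated above can be expressed as a scan over aligned protein sequences. The Python sketch below is a minimal illustration under stated assumptions: the function name divergent_positions is hypothetical, the inputs are assumed to be pre-aligned sequences of equal length, and the example sequences are invented placeholders rather than data from the sequence listing.

```python
def divergent_positions(target, others):
    """List positions where `target` differs from a residue that is
    identical in every sequence of `others`.

    Returns tuples of (1-based position, conserved residue, target residue).
    """
    if any(len(seq) != len(target) for seq in others):
        raise ValueError("all sequences must be aligned to the same length")
    hits = []
    for index, residue in enumerate(target):
        column = {seq[index] for seq in others}
        if len(column) == 1:              # absolutely conserved in the others
            conserved = next(iter(column))
            if conserved != residue:      # but different in the target
                hits.append((index + 1, conserved, residue))
    return hits


if __name__ == "__main__":
    # Hypothetical toy alignment for illustration only.
    target = "MTIKA"
    others = ["MTVKA", "MTVKA", "MTVRA"]
    print(divergent_positions(target, others))
```

In the toy alignment only position 3 is reported: position 4 varies among the comparison sequences, so it is not absolutely conserved and is not scored.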

TABLE 2
Divergent Amino Acids in S.A.AR86 Potentially Related to the Adult Neurovirulence Phenotype

Position in S.A.AR86     Conserved Amino Acid     S.A.AR86 Amino Acid
nsP1   583               Thr                      Ile
nsP2   256               Arg                      Ala
       648               Ile                      Val
       651               Lys                      Glu
nsP3   344               Gly                      Glu
       386               Tyr                      Ser
       441               Asp                      Gly
       445               Ile                      Met
       537               Cys                      Opal
E2     243               Ser                      Leu
6K     30                Val                      Ile
E1     112               Val                      Ala
       169               Leu                      Ser

EXAMPLE 5 pS55 Molecular Clone of S.A.AR86

As a first step in investigating the unique adult mouse neurovirulence phenotype of S.A.AR86, a full-length cDNA clone of the S.A.AR86 genome was constructed. The sources of cDNA included conventional cDNA clones (Davis et al., Virology 171, 189-204 (1989)) as well as uncloned RT-PCR fragments derived from the S.A.AR86 genome. As described previously, these were substituted, starting at the 3′ end, into pTR5000 (McKnight et al., J. Virol. 70, 1981-89 (1996)), a full-length Sindbis clone from which infectious genomic replicas could be derived by transcription with SP6 polymerase in vitro.

The end result was pS55, a molecular clone of S.A.AR86 from which infectious transcripts could be produced and which contained four nucleotide changes (G for A at nt 215; G for C at nt 3863; G for A at nt 5984; and C for T at nt 9113) but no amino acid coding differences with respect to the S.A.AR86 genomic RNA (amino acid sequence of S.A.AR86 presented in SEQ ID NO:2 and SEQ ID NO:3). The nucleotide sequence of clone pS55 is presented in SEQ ID NO:7.

As has been described by Simpson et al., Virology 222, 464-69 (1996), neurovirulence and replication of the virus derived from pS55 (S55) were compared with those of S.A.AR86. It was found that S55 exhibits the distinctive adult neurovirulence characteristic of S.A.AR86. Like S.A.AR86, S55 produces 100% mortality in adult mice infected with the virus and the survival times of animals infected with both viruses were indistinguishable. In addition, S55 and S.A.AR86 were found to replicate to essentially equivalent titers in vivo, and the profiles of S55 and S.A.AR86 virus growth in the central nervous system and periphery were very similar.

From these data it was concluded that the silent changes found in virus derived from clone pS55 had little or no effect on its growth or virulence, and that this molecularly cloned virus accurately represents the biological isolate, S.A.AR86.

EXAMPLE 6 Construction of the Consensus AR339 Virus TR339

The consensus sequence of the Sindbis virus AR339 isolate, the prototype alphavirus, was deduced. The consensus AR339 sequence was inferred by comparison of the TRSB sequence (a laboratory-derived AR339 strain) with the complete or partial sequences of HR_(sp) (the GenBank sequence; Strauss et al., Virology 133, 92-110 (1984)), SV1A, and NSV (AR339-derived laboratory strains; Lustig et al., J. Virol. 62, 2329-36 (1988)), and SIN (a laboratory-derived AR339 strain; Davis et al., Virology 161, 101-108 (1987), Strauss et al., J. Virol. 65, 4654-64 (1991)). Each of these viruses was descended from AR339. Where these sequences differed from each other, they also were compared with the amino acid sequences of other viruses related to Sindbis virus: Ockelbo82, S.A.AR86, Girdwood S.A., and the somewhat more distantly related Aura virus. Rumenapf et al., Virology 208, 621-33 (1995).

The details of determining a consensus AR339 sequence and constructing the consensus virus TR339 have been described elsewhere. McKnight et al., J. Virol. 70, 1981-89 (1996); Klimstra et al., manuscript in preparation. The nucleotide sequence of pTR339 is presented in SEQ ID NO:8. The deduced amino acid sequences of the pTR339 non-structural and structural polyproteins are shown as SEQ ID NO:9 and SEQ ID NO:10, respectively. Referring to SEQ ID NO:8, nucleotides 1 through 59 represent the 5′ UTR, the non-structural polyprotein is encoded by nucleotides 60 through 7598 (nsP1--nt60 through nt1679; nsP2--nt1680 through nt4099; nsP3--nt4100 through nt5747 or nt5768; nsP4--nt5769 through nt7598), the structural polyprotein is encoded by nucleotides 7647 through 11381 (capsid--nt7647 through nt8438; E3--nt8439 through nt8630; E2--nt8631 through nt9899; 6K--nt9900 through nt10064; E1--nt10065 through nt11381), and the 3′ UTR is represented by nucleotides 11382 through 11703. There is an opal termination codon at nucleotides 5748 through 5750. Position 1897 in SEQ ID NO:9 indicates the position of the opal termination codon in the coding region of the nonstructural polyprotein. The consensus nucleotide sequence diverged from the pTRSB sequence at three coding positions (nsP3 528, E2 1, and E1 72). These differences are illustrated in Table 3.

TABLE 3
Amino Acid Differences Between Laboratory Strain TRSB and Molecular Clone TR339

           nsP3 528 (nt5683)     E2 1 (nt8633)     E1 72 (nt10279)
TR339      Arg (CGA)             Ser (AGC)         Ala (GCU)
TRSB       Gln (CAA)             Arg (AGA)         Val (GUU)
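The codon assignments in Table 3 can be checked against the standard genetic code. The Python sketch below is illustrative only; the names BASES, AMINO_ACIDS, translate_codon, and TABLE_3_CODONS are hypothetical, and the 64-letter string encodes the standard code in U, C, A, G order with "*" marking a termination codon.

```python
# Standard genetic code laid out in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)


def translate_codon(codon):
    """Translate one RNA (or DNA) codon to a one-letter amino acid code."""
    codon = codon.upper().replace("T", "U")
    first, second, third = (BASES.index(base) for base in codon)
    return AMINO_ACIDS[16 * first + 4 * second + third]


# Codons as printed in Table 3: (TR339 codon, TRSB codon) per site.
TABLE_3_CODONS = {
    "nsP3 528": ("CGA", "CAA"),
    "E2 1": ("AGC", "AGA"),
    "E1 72": ("GCU", "GUU"),
}

if __name__ == "__main__":
    for site, (tr339, trsb) in TABLE_3_CODONS.items():
        print(site, translate_codon(tr339), translate_codon(trsb))
```

Each of the three differences corresponds to a single nucleotide substitution within the codon. The same lookup returns "*" for UGA, the opal termination codon discussed elsewhere in this specification.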

EXAMPLE 7 Animals Used for In Vivo Localization Studies

Specific pathogen free CD-1 mice were obtained from Charles River Breeding Laboratories (Raleigh, N.C.) at 21 days of age and maintained under barrier conditions until approximately 37 days of age. Intracerebral (ic) inoculations were performed as previously described, Simpson et al., Virol. 222, 464-69 (1996), with 500 PFU of S51 (an attenuated mutant of S55) or 10³ PFU of S55. Animals inoculated peripherally were first anesthetized with METOFANE®. Then, 25 μl of diluent (PBS, pH 7.2, 1% donor calf serum, 100 u/ml penicillin, 50 μg/ml streptomycin, 0.9 mM CaCl₂, and 0.5 mM MgCl₂) containing 10³ PFU of virus were injected either intravenously (iv) into the tail vein, subcutaneously (sc) into the skin above the shoulder blades on the middle of the back, or intraperitoneally (ip) in the lower right abdomen. Animals were sacrificed at various times post-inoculation as previously described. Simpson et al., Virol. 222, 464-69 (1996). Brains (including brainstems) were homogenized in diluent to 30% w/v, and right quadriceps were homogenized in diluent to 25% w/v. Homogenates were handled and titered as described previously. Simpson et al., Virol. 222, 464-69 (1996). Bone marrow was harvested by crushing both femurs from each animal in sufficient diluent to produce a 30% w/v suspension (calculated as weight of uncrushed femurs in volume of diluent). Samples were stored at −70° C. For titration, samples were thawed and clarified by centrifugation at 1,000 × g for 20 minutes at 4° C. before being titered by conventional plaque assay on BHK-21 cells.
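The weight/volume figures above imply simple arithmetic, sketched here in Python for illustration. The function names are hypothetical, and the conversion from a homogenate titer (PFU/ml) to a tissue titer (PFU/g) is an assumption about how per-gram values could be derived; the cited procedures are not reproduced here.

```python
def diluent_volume_ml(tissue_weight_g, percent_w_v):
    """Volume of diluent (ml) giving the requested % w/v suspension,
    taking % w/v as grams of tissue per 100 ml of diluent."""
    return tissue_weight_g * 100.0 / percent_w_v


def pfu_per_gram(pfu_per_ml_homogenate, percent_w_v):
    """Convert a homogenate titer (PFU/ml) to a tissue titer (PFU/g),
    assuming the homogenate contains percent_w_v / 100 grams per ml."""
    grams_per_ml = percent_w_v / 100.0
    return pfu_per_ml_homogenate / grams_per_ml


if __name__ == "__main__":
    # Hypothetical example: 0.3 g of tissue at 30% w/v.
    print(diluent_volume_ml(0.3, 30))
    # Hypothetical example: 25 PFU/ml measured in a 25% w/v homogenate.
    print(pfu_per_gram(25, 25))
```

The example inputs are placeholders chosen for easy arithmetic and are not measurements from the studies described.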

EXAMPLE 8 Tissue Preparation for In Situ Hybridization Studies

Animals were anesthetized by ip injection of 0.5 ml AVERTIN® at various times post-inoculation followed by perfusion with 60 to 75 ml of 4% paraformaldehyde in PBS (pH 7.2) at a flow rate of 10 ml per minute. The entire carcass was decalcified for 8 to 10 weeks in 4% paraformaldehyde containing 8% EDTA in PBS (pH 6.8) at 4° C. This solution was changed twice during the decalcification period. Selected tissues were cut into blocks approximately 3 mm thick and placed into biopsy cassettes for paraffin embedding and sectioning. Blocks were embedded, sectioned and hematoxylin/eosin stained by Experimental Pathology Laboratories (Research Triangle Park, N.C.) or North Carolina State University Veterinary School Pathology Laboratory (Raleigh, N.C.).

EXAMPLE 9 In Situ Hybridization

Hybridizations were performed using a [³⁵S]-UTP labeled S.A.AR86-specific riboprobe derived from pDS-45. Clone pDS-45 was constructed by first amplifying a 707 base pair fragment from pS55 by PCR using primers 7241 (5′-CTGCGGCGGATTCATCTTGC-3′, SEQ ID NO:11) and SC-3 (5′-CTCCAACTTAAGTG-3′, SEQ ID NO:12). The resulting 707 base pair fragment was purified using a GENE CLEAN® kit (Bio101, Calif.), digested with HhaI, and cloned into the SmaI site of pSP72 (Promega). Linearizing pDS-45 with EcoRV and performing an in vitro transcription reaction with SP6 DNA-dependent RNA polymerase (Promega) in the presence of [³⁵S]-UTP resulted in a riboprobe approximately 500 nucleotides in length of which 445 nucleotides were complementary to the S.A.AR86 genome (nucleotides 7371 through 7816). A riboprobe specific for the influenza strain PR-8 hemagglutinin (HA) gene was used as a control probe to test non-specific binding. The in situ hybridizations were performed as described previously (Charles et al., Virol. 208, 662-71 (1995)) using 10⁵ cpm of probe per slide.

EXAMPLE 10 Replication of S.A.AR86 in Bone Marrow

Three groups of six adult mice each were inoculated peripherally (sc, ip, or iv) with 1200 PFU of S55 (a molecular clone of S.A.AR86) in 25 μl of diluent. Under these conditions, the infection produced no morbidity or mortality. Two mice from each group were anesthetized and sacrificed at 2, 4 and 6 days post-inoculation by exsanguination. The serum, brain (including brainstem), right quadricep, and both femurs were harvested and titered by plaque assay. Virus was never detected in the quadricep samples of animals inoculated sc (Table 4). A single animal inoculated ip (two days post-inoculation) and two mice inoculated iv (at four and six days post-inoculation) had detectable virus in the right quadricep, but the titer was at or just above the limit of detection (6.25 PFU/g tissue). Virus was present sporadically or at low levels in the brain and serum of animals regardless of the route of inoculation. Virus was detected in the bone marrow of animals regardless of the route of inoculation. However, the presence of virus in bone marrow of animals inoculated sc or ip was more sporadic than in animals inoculated iv, where five out of six animals had detectable virus. These results suggest that S55 targets to the bone marrow, especially following iv inoculation.

The level and frequency of virus detected in the serum and muscle suggested that virus detected in the bone marrow was not residual virus contamination from blood or connective tissue remaining in bone marrow samples. The following experiment also suggested that virus in bone marrow was not due to tissue or serum contamination. Mice were inoculated ic with 1200 PFU of S55 in 25 μl of diluent. Animals were sacrificed at 0.25, 0.5, 1, 1.5, 2, 3, 4, 5, and 6 days post-inoculation, and the carcasses were decalcified as described in Example 8. Coronal sections taken at approximately 3 mm intervals through the head, spine (including shoulder area), and hips were probed with an S55-specific [³⁵S]-UTP labeled riboprobe derived from pDS-45. Positive in situ hybridization signal was detected by one day post-inoculation in the bone marrow of the skull (data not shown). Weak signal also was present in some of the chondrocytes of the vertebrae, suggesting that S55 was replicating in these cells as well. Although the frequency of positive bone marrow cells was low, the signal was very intense over individual positive cells. This result strongly suggests that S55 replicates in vivo in a subset of cells contained in the bone marrow.

EXAMPLE 11 Other Sindbis Group Viruses

It was of interest to determine if the ability to replicate in the bone marrow of mice was unique to S55 or was a general feature of other viruses, both Sindbis and non-Sindbis viruses, in the Sindbis group. Six 38-day-old female CD-1 mice were inoculated iv with 25 μl of diluent containing 10³ PFU of S55, Ockelbo82, Girdwood S.A., TR339, or TRSB. At 2, 4 and 6 days post-inoculation two mice from each group were sacrificed and whole blood, serum, brain (including brainstem), right quadricep, and both femurs were harvested for virus titration.

The results of this experiment were similar to those with S55. TRSB infected animals had no virus detectable in serum or whole blood in any animal at any time, and with the other viruses tested, no virus was detected in the serum or whole blood of any animal beyond two days post-inoculation (detection limit, 25 PFU/ml). Neither TRSB nor TR339 was detectable in the brains of infected animals at any time post-inoculation. S55, Girdwood S.A., and Ockelbo82 were present in the brains of infected animals sporadically with the titers being at or near the 75 PFU/g level of detection. All the tested viruses were found sporadically at or slightly above the 50 PFU/g detection limit in the right quadricep of infected animals except for a single animal four days post-inoculation with TRSB which had nearly 10⁵ PFU/g of virus in its quadricep.

The frequency at which the different viruses were detected in bone marrow varied widely with S55 and Girdwood S.A. being the most frequently isolated (five out of six animals) and Ockelbo82 and TRSB being the least frequently isolated from bone marrow (one out of six animals and two out of six animals, respectively) (Table 4). Girdwood S.A. and S55 gave nearly identical profiles in all tissues. Girdwood S.A., unlike S.A.AR86, is not neurovirulent in adult mice (Example 4), suggesting that the adult neurovirulence phenotype is distinct from the ability of the virus to replicate efficiently in bone marrow.

TABLE 4
Titers Following IV Inoculation of Virus

                                        Tissue Titered
                         Days Post-     Bone Marrow   Serum       Blood       Brain      Quadricep
Virus           Animal   Inoculation    (PFU/g)       (PFU/ml)    (PFU/ml)    (PFU/g)    (PFU/g)
S55             A        2              1125          N.D.^(a)    N.D.        N.D.       N.D.
                B        2              488           50          200         N.D.       N.D.
                A        4              863           N.D.        N.D.        N.D.       550
                B        4              113           N.D.        N.D.        75         N.D.
                A        6              N.D.          N.D.        N.D.        N.D.       50
                B        6              37.5          N.D.        N.D.        N.D.       N.D.
                Limit of Detection      37.5          25          25          75         50
TR339           A        2              N.D.          N.D.        N.D.        N.D.       N.D.
                B        2              1500          75          700         N.D.       N.D.
                A        4              1050          N.D.        N.D.        N.D.       N.D.
                B        4              1762          N.D.        N.D.        N.D.       400
                A        6              N.D.          N.D.        N.D.        N.D.       N.D.
                B        6              N.D.          N.D.        N.D.        N.D.       N.D.
                Limit of Detection      37.5          25          25          37.5       50
TRSB            A        2              N.D.          N.D.        N.D.        N.D.       N.D.
                B        2              N.D.          N.D.        N.D.        N.D.       N.D.
                A        4              150           N.D.        N.D.        N.D.       1000
                B        4              N.D.          N.D.        N.D.        N.D.       100000
                A        6              N.D.          N.D.        N.D.        N.D.       N.D.
                B        6              37.5          N.D.        N.D.        N.D.       N.D.
                Limit of Detection      37.5          25          25          37.5       50
Girdwood S.A.   A        2              22000         2325        1450        300        50
                B        2              2500          1200        2600        N.D.       N.D.
                A        4              788           N.D.        N.D.        N.D.       N.D.
                B        4              113           N.D.        N.D.        75         N.D.
                A        6              N.D.          N.D.        N.D.        N.D.       N.D.
                B        6              75            N.D.        N.D.        1700       N.D.
                Limit of Detection      37.5          25          25          75         50
Ockelbo82       A        2              N.D.          125         150         N.D.       N.D.
                B        2              N.D.          50          500         N.D.       200
                A        4              N.D.          N.D.        N.D.        300        N.D.
                B        4              300           N.D.        N.D.        N.D.       N.D.
                A        6              N.D.          N.D.        N.D.        100000     N.D.
                B        6              N.D.          N.D.        N.D.        N.D.       N.D.
                Limit of Detection      37.5          25          25          75         50

^(a)“N.D.” indicates that the virus titers were below the limit of detection.
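The detection frequencies quoted in Example 11 can be tallied from the bone marrow column of Table 4. The Python sketch below is illustrative only: the values are transcribed from the table as read here, in the order A and B at days 2, 4, and 6, with None standing for "N.D."; the names BONE_MARROW_TITERS and detection_frequency are hypothetical.

```python
ND = None  # stands for "N.D.", below the limit of detection

# Bone marrow titers (PFU/g) transcribed from Table 4.
# Order within each list: day 2 A, day 2 B, day 4 A, day 4 B, day 6 A, day 6 B.
BONE_MARROW_TITERS = {
    "S55": [1125, 488, 863, 113, ND, 37.5],
    "TR339": [ND, 1500, 1050, 1762, ND, ND],
    "TRSB": [ND, ND, 150, ND, ND, 37.5],
    "Girdwood S.A.": [22000, 2500, 788, 113, ND, 75],
    "Ockelbo82": [ND, ND, ND, 300, ND, ND],
}


def detection_frequency(titers):
    """Return (animals with detectable virus, animals sampled)."""
    detected = sum(1 for titer in titers if titer is not None)
    return detected, len(titers)


if __name__ == "__main__":
    for virus, titers in BONE_MARROW_TITERS.items():
        detected, sampled = detection_frequency(titers)
        print(f"{virus}: {detected} of {sampled} animals")
```

With the values as transcribed, the tally gives five of six animals for S55 and Girdwood S.A., two of six for TRSB, and one of six for Ockelbo82, matching the frequencies stated in the text.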

EXAMPLE 12 Virus Persistence in Bone Marrow

The next step in our investigations was to evaluate the possibility that S.A.AR86 persisted long-term in bone marrow. S51 is a molecularly cloned, attenuated mutant of S55. S51 differs from S55 by a threonine for isoleucine substitution at amino acid residue 538 of nsP1 and is attenuated in adult mice inoculated intracerebrally. Like S55, S51 targeted to and replicated in the bone marrow of 37-day-old female CD-1 mice following ic inoculation. Mice were inoculated ic with 500 PFU of S51 and sacrificed at 4, 8, 16, and 30 days post-inoculation for determination of bone marrow and serum titers. At no time post-inoculation was virus detected in the serum above the 6.25 PFU/ml detection limit. Virus was detectable in the bone marrow samples of both animals sampled at four days post-inoculation and in one animal eight days post-inoculation (Table 5). No virus was detectable by titration on BHK-21 cells in any of the bone marrow samples beyond eight days post-inoculation. These results suggested that the attenuating mutation present in S51, which reduces the neurovirulence of the virus, did not impair acute viral replication in the bone marrow.

It was notable that the plaque size on BHK-21 cells of virus recovered on day 4 post-inoculation was smaller than the size of plaques produced by the inoculum virus, and that plaques produced from virus recovered from the day 8 post-inoculation samples were even smaller and barely visible. This suggests a strong selective pressure in the bone marrow for virus that is much less efficient in forming plaques on BHK-21 cells.

To demonstrate that S51 virus genomes were present in bone marrow cells long after acute infection, four to six-week-old female CD-1 mice were inoculated ic with 500 PFU of S51. Three months post-inoculation two animals were sacrificed, perfused with paraformaldehyde and decalcified as described in Example 8. The heads and hind limbs from these animals were paraffin embedded, sectioned, and probed with an S.A.AR86-specific [³⁵S]-UTP labeled riboprobe derived from clone pDS-45. In situ hybridization signal was clearly present in discrete cells of the bone and bone marrow of the legs (data not shown). Furthermore, no in situ hybridization signal was detected in an adjacent control section probed with an influenza virus HA gene specific riboprobe. As the relative sensitivity of in situ hybridization is reduced in decalcified tissues (Peter Charles, personal communication), these cells likely contain a relatively high number of viral sequences, even at three months post-inoculation. No in situ hybridization signal was observed in mid-sagittal sections of the heads with the S.A.AR86-specific probe, although focal lesions were observed in the brain indicative of the prior acute infection with S51.

TABLE 5
S51 Titers in Bone Marrow Following IC Inoculation of 500 PFU

Days Post-      Titers (Total PFU/Animal)     Limit of
Inoculation     Animal A      Animal B        Detection
4               2100          380             62.5
8               62.5          N.D.^(a)        62.5
16              N.D.          N.D.            62.5
30              N.D.          N.D.            62.5

^(a)“N.D.” indicates that the virus titers were below the limit of detection.

EXAMPLE 13 Replication of S.A.AR86 within Bone/Joint Tissue of Adult Mice

Several Old World alphaviruses, including Ross River virus, Chikungunya virus, Ockelbo82, and S.A.AR86, are associated with acute and persistent arthritis/arthralgia in humans. Molecular clones of several Sindbis group viruses, including S.A.AR86, were used to investigate alphavirus replication within bone/joint tissue.

Following intravenous inoculation of S.A.AR86 into adult CD-1 mice, viral replication was observed in bone/joint tissue, but not surrounding muscle tissue of the hind limbs. Infectious virus was detectable 24 hours post-infection; however, viral titer within bone/joint tissue was maximal 72 hours post-infection. Fractionation of hind limbs from infected animals revealed that the hip and knee joints were the predominant sites of viral replication. Replication within bone/joint tissue appears to be a common trait of Sindbis-group viruses, since the laboratory strains TR339 and TRSB also replicated within bone/joint tissue. In situ hybridization and S.A.AR86 based double promoter vectors expressing green fluorescent protein were used to further localize S.A.AR86 infected cells within bone/joint tissue. Green fluorescent protein expression was detected in bone/joint tissue for at least one month post-inoculation. These studies demonstrated that cells within the endosteum of synovial joints were the predominant site of S.A.AR86 replication.

12 11663 base pairs nucleic acid double linear cDNA CDS 60..7559 CDS7608..11342 1 ATTGGCGGCG TAGTACACAC TATTGAATCA AACAGCCGAC CAATTGCACTACCATCACA 59 ATG GAG AAG CCA GTA GTT AAC GTA GAC GTA GAC CCT CAG AGT CCGTTT 107 Met Glu Lys Pro Val Val Asn Val Asp Val Asp Pro Gln Ser Pro Phe1 5 10 15 GTC GTG CAA CTG CAA AAG AGC TTC CCG CAA TTT GAG GTA GTA GCACAG 155 Val Val Gln Leu Gln Lys Ser Phe Pro Gln Phe Glu Val Val Ala Gln20 25 30 CAG GTC ACT CCA AAT GAC CAT GCT AAT GCC AGA GCA TTT TCG CAT CTG203 Gln Val Thr Pro Asn Asp His Ala Asn Ala Arg Ala Phe Ser His Leu 3540 45 GCC AGT AAA CTA ATC GAG CTG GAG GTT CCT ACC ACA GCG ACG ATT TTG251 Ala Ser Lys Leu Ile Glu Leu Glu Val Pro Thr Thr Ala Thr Ile Leu 5055 60 GAC ATA GGC AGC GCA CCG GCT CGT AGA ATG TTT TCC GAG CAC CAG TAC299 Asp Ile Gly Ser Ala Pro Ala Arg Arg Met Phe Ser Glu His Gln Tyr 6570 75 80 CAT TGC GTT TGC CCC ATG CGT AGT CCA GAA GAC CCG GAC CGC ATG ATG347 His Cys Val Cys Pro Met Arg Ser Pro Glu Asp Pro Asp Arg Met Met 8590 95 AAA TAT GCC AGC AAA CTG GCG GAA AAA GCA TGT AAG ATT ACA AAC AAG395 Lys Tyr Ala Ser Lys Leu Ala Glu Lys Ala Cys Lys Ile Thr Asn Lys 100105 110 AAC TTG CAT GAG AAG ATC AAG GAC CTC CGG ACC GTA CTT GAT ACA CCG443 Asn Leu His Glu Lys Ile Lys Asp Leu Arg Thr Val Leu Asp Thr Pro 115120 125 GAT GCT GAA ACG CCA TCA CTC TGC TTC CAC AAC GAT GTT ACC TGC AAC491 Asp Ala Glu Thr Pro Ser Leu Cys Phe His Asn Asp Val Thr Cys Asn 130135 140 ACG CGT GCC GAG TAC TCC GTC ATG CAG GAC GTG TAC ATC AAC GCT CCC539 Thr Arg Ala Glu Tyr Ser Val Met Gln Asp Val Tyr Ile Asn Ala Pro 145150 155 160 GGA ACT ATT TAC CAC CAG GCT ATG AAA GGC GTG CGG ACC CTG TACTGG 587 Gly Thr Ile Tyr His Gln Ala Met Lys Gly Val Arg Thr Leu Tyr Trp165 170 175 ATT GGC TTC GAC ACC ACC CAG TTC ATG TTC TCG GCT ATG GCA GGTTCG 635 Ile Gly Phe Asp Thr Thr Gln Phe Met Phe Ser Ala Met Ala Gly Ser180 185 190 TAC CCT GCA TAC AAC ACC AAC TGG GCC GAC GAA AAA GTC CTT GAAGCG 683 Tyr Pro Ala Tyr Asn Thr Asn Trp Ala Asp Glu Lys Val Leu Glu Ala195 200 205 CGT AAC ATC 
GGA CTC TGC AGC ACA AAG CTG AGT GAA GGC AGG ACAGGA 731 Arg Asn Ile Gly Leu Cys Ser Thr Lys Leu Ser Glu Gly Arg Thr Gly210 215 220 AAG TTG TCG ATA ATG AGG AAG AAG GAG TTG AAG CCC GGG TCA CGGGTT 779 Lys Leu Ser Ile Met Arg Lys Lys Glu Leu Lys Pro Gly Ser Arg Val225 230 235 240 TAT TTC TCC GTT GGA TCG ACA CTT TAC CCA GAA CAC AGA GCCAGC TTG 827 Tyr Phe Ser Val Gly Ser Thr Leu Tyr Pro Glu His Arg Ala SerLeu 245 250 255 CAG AGC TGG CAT CTT CCA TCG GTG TTC CAC TTG AAA GGA AAGCAG TCG 875 Gln Ser Trp His Leu Pro Ser Val Phe His Leu Lys Gly Lys GlnSer 260 265 270 TAC ACT TGC CGC TGT GAT ACA GTG GTG AGC TGC GAA GGC TACGTA GTG 923 Tyr Thr Cys Arg Cys Asp Thr Val Val Ser Cys Glu Gly Tyr ValVal 275 280 285 AAG AAA ATC ACC ATC AGT CCC GGG ATC ACG GGA GAA ACC GTGGGA TAC 971 Lys Lys Ile Thr Ile Ser Pro Gly Ile Thr Gly Glu Thr Val GlyTyr 290 295 300 GCG GTT ACA AAC AAT AGC GAG GGC TTC TTG CTA TGC AAA GTTACC GAT 1019 Ala Val Thr Asn Asn Ser Glu Gly Phe Leu Leu Cys Lys Val ThrAsp 305 310 315 320 ACA GTA AAA GGA GAA CGG GTA TCG TTC CCC GTG TGC ACGTAT ATC CCG 1067 Thr Val Lys Gly Glu Arg Val Ser Phe Pro Val Cys Thr TyrIle Pro 325 330 335 GCC ACC ATA TGC GAT CAG ATG ACC GGC ATA ATG GCC ACGGAT ATC TCA 1115 Ala Thr Ile Cys Asp Gln Met Thr Gly Ile Met Ala Thr AspIle Ser 340 345 350 CCT GAC GAT GCA CAA AAA CTT CTG GTT GGG CTC AAC CAGCGA ATC GTC 1163 Pro Asp Asp Ala Gln Lys Leu Leu Val Gly Leu Asn Gln ArgIle Val 355 360 365 ATT AAC GGT AAG ACT AAC AGG AAC ACC AAT ACC ATG CAAAAT TAC CTT 1211 Ile Asn Gly Lys Thr Asn Arg Asn Thr Asn Thr Met Gln AsnTyr Leu 370 375 380 CTG CCA ATC ATT GCA CAA GGG TTC AGC AAA TGG GCC AAGGAG CGC AAA 1259 Leu Pro Ile Ile Ala Gln Gly Phe Ser Lys Trp Ala Lys GluArg Lys 385 390 395 400 GAA GAT CTT GAC AAT GAA AAA ATG CTG GGC ACC AGAGAG CGC AAG CTT 1307 Glu Asp Leu Asp Asn Glu Lys Met Leu Gly Thr Arg GluArg Lys Leu 405 410 415 ACA TAT GGC TGC TTG TGG GCG TTT CGC ACT AAG AAAGTG CAC TCG TTC 1355 Thr Tyr Gly Cys Leu Trp Ala Phe Arg Thr Lys Lys ValHis Ser Phe 420 425 430 TAT 
CGC CCA CCT GGA ACG CAG ACC ATC GTA AAA GTCCCA GCC TCT TTT 1403 Tyr Arg Pro Pro Gly Thr Gln Thr Ile Val Lys Val ProAla Ser Phe 435 440 445 AGC GCT TTC CCC ATG TCA TCC GTA TGG ACT ACC TCTTTG CCC ATG TCG 1451 Ser Ala Phe Pro Met Ser Ser Val Trp Thr Thr Ser LeuPro Met Ser 450 455 460 CTG AGG CAG AAG ATG AAA TTG GCA TTA CAA CCA AAGAAG GAG GAA AAA 1499 Leu Arg Gln Lys Met Lys Leu Ala Leu Gln Pro Lys LysGlu Glu Lys 465 470 475 480 CTG CTG CAA GTC CCG GAG GAA TTA GTT ATG GAGGCC AAG GCT GCT TTC 1547 Leu Leu Gln Val Pro Glu Glu Leu Val Met Glu AlaLys Ala Ala Phe 485 490 495 GAG GAT GCT CAG GAG GAA TCC AGA GCG GAG AAGCTC CGA GAA GCA CTC 1595 Glu Asp Ala Gln Glu Glu Ser Arg Ala Glu Lys LeuArg Glu Ala Leu 500 505 510 CCA CCA TTA GTG GCA GAC AAA GGT ATC GAG GCAGCT GCG GAA GTT GTC 1643 Pro Pro Leu Val Ala Asp Lys Gly Ile Glu Ala AlaAla Glu Val Val 515 520 525 TGC GAA GTG GAG GGG CTC CAG GCG GAC ACC GGAGCA GCA CTC GTC GAA 1691 Cys Glu Val Glu Gly Leu Gln Ala Asp Thr Gly AlaAla Leu Val Glu 530 535 540 ACC CCG CGC GGT CAT GTA AGG ATA ATA CCT CAAGCA AAT GAC CGT ATG 1739 Thr Pro Arg Gly His Val Arg Ile Ile Pro Gln AlaAsn Asp Arg Met 545 550 555 560 ATC GGA CAG TAT ATC GTT GTC TCG CCG ATCTCT GTG CTG AAG AAC GCT 1787 Ile Gly Gln Tyr Ile Val Val Ser Pro Ile SerVal Leu Lys Asn Ala 565 570 575 AAA CTC GCA CCA GCA CAC CCG CTA GCA GACCAG GTT AAG ATC ATA ACG 1835 Lys Leu Ala Pro Ala His Pro Leu Ala Asp GlnVal Lys Ile Ile Thr 580 585 590 CAC TCC GGA AGA TCA GGA AGG TAT GCA GTCGAA CCA TAC GAC GCT AAA 1883 His Ser Gly Arg Ser Gly Arg Tyr Ala Val GluPro Tyr Asp Ala Lys 595 600 605 GTA CTG ATG CCA GCA GGA AGT GCC GTA CCATGG CCA GAA TTC TTA GCA 1931 Val Leu Met Pro Ala Gly Ser Ala Val Pro TrpPro Glu Phe Leu Ala 610 615 620 CTG AGT GAG AGC GCC ACG CTT GTG TAC AACGAA AGA GAG TTT GTG AAC 1979 Leu Ser Glu Ser Ala Thr Leu Val Tyr Asn GluArg Glu Phe Val Asn 625 630 635 640 CGC AAG CTG TAC CAT ATT GCC ATG CACGGT CCC GCT AAG AAT ACA GAA 2027 Arg Lys Leu Tyr His Ile Ala Met His GlyPro Ala Lys Asn Thr Glu 
645 650 655 GAG GAG CAG TAC AAG GTT ACA AAG GCAGAG CTC GCA GAA ACA GAG TAC 2075 Glu Glu Gln Tyr Lys Val Thr Lys Ala GluLeu Ala Glu Thr Glu Tyr 660 665 670 GTG TTT GAC GTG GAC AAG AAG CGA TGCGTT AAG AAG GAA GAA GCC TCA 2123 Val Phe Asp Val Asp Lys Lys Arg Cys ValLys Lys Glu Glu Ala Ser 675 680 685 GGA CTT GTC CTT TCG GGA GAA CTG ACCAAC CCG CCC TAT CAC GAA CTA 2171 Gly Leu Val Leu Ser Gly Glu Leu Thr AsnPro Pro Tyr His Glu Leu 690 695 700 GCT CTT GAG GGA CTG AAG ACT CGA CCCGCG GTC CCG TAC AAG GTT GAA 2219 Ala Leu Glu Gly Leu Lys Thr Arg Pro AlaVal Pro Tyr Lys Val Glu 705 710 715 720 ACA ATA GGA GTG ATA GGC ACA CCAGGA TCG GGC AAG TCA GCT ATC ATC 2267 Thr Ile Gly Val Ile Gly Thr Pro GlySer Gly Lys Ser Ala Ile Ile 725 730 735 AAG TCA ACT GTC ACG GCA CGT GATCTT GTT ACC AGC GGA AAG AAA GAA 2315 Lys Ser Thr Val Thr Ala Arg Asp LeuVal Thr Ser Gly Lys Lys Glu 740 745 750 AAC TGC CGC GAA ATT GAG GCC GACGTG CTA CGG CTG AGG GGC ATG CAG 2363 Asn Cys Arg Glu Ile Glu Ala Asp ValLeu Arg Leu Arg Gly Met Gln 755 760 765 ATC ACG TCG AAG ACA GTG GAT TCGGTT ATG CTC AAC GGA TGC CAC AAA 2411 Ile Thr Ser Lys Thr Val Asp Ser ValMet Leu Asn Gly Cys His Lys 770 775 780 GCC GTA GAA GTG CTG TAT GTT GACGAA GCG TTC CGG TGC CAC GCA GGA 2459 Ala Val Glu Val Leu Tyr Val Asp GluAla Phe Arg Cys His Ala Gly 785 790 795 800 GCA CTA CTT GCC TTG ATT GCAATC GTC AGA CCC CGT AAG AAG GTA GTA 2507 Ala Leu Leu Ala Leu Ile Ala IleVal Arg Pro Arg Lys Lys Val Val 805 810 815 CTA TGC GGA GAC CCT AAG CAATGC GGA TTC TTC AAC ATG ATG CAA CTA 2555 Leu Cys Gly Asp Pro Lys Gln CysGly Phe Phe Asn Met Met Gln Leu 820 825 830 AAG GTA CAT TTC AAC CAC CCTGAA AAA GAC ATA TGT ACC AAG ACA TTC 2603 Lys Val His Phe Asn His Pro GluLys Asp Ile Cys Thr Lys Thr Phe 835 840 845 TAC AAG TTT ATC TCC CGA CGTTGC ACA CAG CCA GTC ACG GCT ATT GTA 2651 Tyr Lys Phe Ile Ser Arg Arg CysThr Gln Pro Val Thr Ala Ile Val 850 855 860 TCG ACA CTG CAT TAC GAT GGAAAA ATG AAA ACC ACA AAC CCG TGC AAG 2699 Ser Thr Leu His Tyr Asp Gly LysMet Lys Thr Thr Asn 
Pro Cys Lys 865 870 875 880 AAG AAC ATC GAA ATC GACATT ACA GGG GCC ACG AAG CCG AAG CCA GGG 2747 Lys Asn Ile Glu Ile Asp IleThr Gly Ala Thr Lys Pro Lys Pro Gly 885 890 895 GAC ATC ATC CTG ACA TGTTTC CGC GGG TGG GTT AAG CAA CTG CAA ATC 2795 Asp Ile Ile Leu Thr Cys PheArg Gly Trp Val Lys Gln Leu Gln Ile 900 905 910 GAC TAT CCC GGA CAT GAGGTA ATG ACA GCC GCG GCC TCA CAA GGG CTA 2843 Asp Tyr Pro Gly His Glu ValMet Thr Ala Ala Ala Ser Gln Gly Leu 915 920 925 ACC AGA AAA GGA GTA TATGCC GTC CGG CAA AAA GTC AAT GAA AAC CCG 2891 Thr Arg Lys Gly Val Tyr AlaVal Arg Gln Lys Val Asn Glu Asn Pro 930 935 940 CTG TAC GCG ATC ACA TCAGAG CAT GTG AAC GTG TTG CTC ACC CGC ACT 2939 Leu Tyr Ala Ile Thr Ser GluHis Val Asn Val Leu Leu Thr Arg Thr 945 950 955 960 GAG GAC AGG CTA GTATGG AAA ACT TTA CAG GGC GAC CCA TGG ATT AAG 2987 Glu Asp Arg Leu Val TrpLys Thr Leu Gln Gly Asp Pro Trp Ile Lys 965 970 975 CAG CTC ACT AAC GTACCT AAA GGA AAT TTT CAG GCC ACC ATC GAG GAC 3035 Gln Leu Thr Asn Val ProLys Gly Asn Phe Gln Ala Thr Ile Glu Asp 980 985 990 TGG GAA GCT GAA CACAAG GGA ATA ATT GCT GCG ATA AAC AGT CCC GCT 3083 Trp Glu Ala Glu His LysGly Ile Ile Ala Ala Ile Asn Ser Pro Ala 995 1000 1005 CCC CGT ACC AATCCG TTC AGC TGC AAG ACT AAC GTT TGC TGG GCG AAA 3131 Pro Arg Thr Asn ProPhe Ser Cys Lys Thr Asn Val Cys Trp Ala Lys 1010 1015 1020 GCA CTG GAACCG ATA CTG GCC ACG GCC GGT ATC GTA CTT ACC GGT TGC 3179 Ala Leu Glu ProIle Leu Ala Thr Ala Gly Ile Val Leu Thr Gly Cys 1025 1030 1035 1040 CAGTGG AGC GAG CTG TTC CCA CAG TTT GCG GAT GAC AAA CCA CAC TCG 3227 Gln TrpSer Glu Leu Phe Pro Gln Phe Ala Asp Asp Lys Pro His Ser 1045 1050 1055GCC ATC TAC GCC TTA GAC GTA ATT TGC ATT AAG TTT TTC GGC ATG GAC 3275 AlaIle Tyr Ala Leu Asp Val Ile Cys Ile Lys Phe Phe Gly Met Asp 1060 10651070 TTG ACA AGC GGG CTG TTT TCC AAA CAG AGC ATC CCG TTA ACG TAC CAT3323 Leu Thr Ser Gly Leu Phe Ser Lys Gln Ser Ile Pro Leu Thr Tyr His1075 1080 1085 CCT GCC GAC TCA GCG AGG CCA GTA GCT CAT TGG GAC AAC AGCCCA GGA 3371 Pro Ala Asp Ser Ala 
Arg Pro Val Ala His Trp Asp Asn Ser ProGly 1090 1095 1100 ACA CGC AAG TAT GGG TAC GAT CAC GCC GTT GCC GCC GAACTC TCC CGT 3419 Thr Arg Lys Tyr Gly Tyr Asp His Ala Val Ala Ala Glu LeuSer Arg 1105 1110 1115 1120 AGA TTT CCG GTG TTC CAG CTA GCT GGG AAA GGCACA CAG CTT GAT TTG 3467 Arg Phe Pro Val Phe Gln Leu Ala Gly Lys Gly ThrGln Leu Asp Leu 1125 1130 1135 CAG ACG GGC AGA ACT AGA GTT ATC TCT GCACAG CAT AAC TTG GTC CCA 3515 Gln Thr Gly Arg Thr Arg Val Ile Ser Ala GlnHis Asn Leu Val Pro 1140 1145 1150 GTG AAC CGC AAT CTC CCT CAC GCC TTAGTC CCC GAG CAC AAG GAG AAA 3563 Val Asn Arg Asn Leu Pro His Ala Leu ValPro Glu His Lys Glu Lys 1155 1160 1165 CAA CCC GGC CCG GTC GAA AAA TTCTTG AGC CAG TTC AAA CAC CAC TCC 3611 Gln Pro Gly Pro Val Glu Lys Phe LeuSer Gln Phe Lys His His Ser 1170 1175 1180 GTA CTT GTG ATC TCA GAG AAAAAA ATT GAA GCT CCC CAC AAG AGA ATC 3659 Val Leu Val Ile Ser Glu Lys LysIle Glu Ala Pro His Lys Arg Ile 1185 1190 1195 1200 GAA TGG ATC GCC CCGATT GGC ATA GCC GGC GCA GAT AAG AAC TAC AAC 3707 Glu Trp Ile Ala Pro IleGly Ile Ala Gly Ala Asp Lys Asn Tyr Asn 1205 1210 1215 CTG GCT TTC GGGTTT CCG CCG CAG GCA CGG TAC GAC CTG GTG TTC ATC 3755 Leu Ala Phe Gly PhePro Pro Gln Ala Arg Tyr Asp Leu Val Phe Ile 1220 1225 1230 AAT ATT GGAACT AAA TAC AGA AAC CAT CAC TTT CAA CAG TGC GAA GAC 3803 Asn Ile Gly ThrLys Tyr Arg Asn His His Phe Gln Gln Cys Glu Asp 1235 1240 1245 CAC GCGGCG ACC TTG AAA ACC CTT TCG CGT TCG GCC CTG AAC TGC CTT 3851 His Ala AlaThr Leu Lys Thr Leu Ser Arg Ser Ala Leu Asn Cys Leu 1250 1255 1260 AACCCC GGA GGC ACC CTC GTG GTG AAG TCC TAC GGT TAC GCC GAC CGC 3899 Asn ProGly Gly Thr Leu Val Val Lys Ser Tyr Gly Tyr Ala Asp Arg 1265 1270 12751280 AAT AGT GAG GAC GTA GTC ACC GCT CTT GCC AGA AAA TTT GTC AGA GTG3947 Asn Ser Glu Asp Val Val Thr Ala Leu Ala Arg Lys Phe Val Arg Val1285 1290 1295 TCT GCA GCG AGG CCA GAG TGC GTC TCA AGC AAT ACA GAA ATGTAC CTG 3995 Ser Ala Ala Arg Pro Glu Cys Val Ser Ser Asn Thr Glu Met TyrLeu 1300 1305 1310 ATT TTC CGA CAA CTA GAC AAC 
AGC CGC ACA CGA CAA TTCACC CCG CAT 4043 Ile Phe Arg Gln Leu Asp Asn Ser Arg Thr Arg Gln Phe ThrPro His 1315 1320 1325 CAT TTG AAT TGT GTG ATT TCG TCC GTG TAC GAG GGTACA AGA GAC GGA 4091 His Leu Asn Cys Val Ile Ser Ser Val Tyr Glu Gly ThrArg Asp Gly 1330 1335 1340 GTT GGA GCC GCA CCG TCG TAC CGT ACT AAA AGGGAG AAC ATT GCT GAT 4139 Val Gly Ala Ala Pro Ser Tyr Arg Thr Lys Arg GluAsn Ile Ala Asp 1345 1350 1355 1360 TGT CAA GAG GAA GCA GTT GTC AAT GCAGCC AAT CCA CTG GGC AGA CCA 4187 Cys Gln Glu Glu Ala Val Val Asn Ala AlaAsn Pro Leu Gly Arg Pro 1365 1370 1375 GGA GAA GGA GTC TGC CGT GCC ATCTAT AAA CGT TGG CCG AAC AGT TTC 4235 Gly Glu Gly Val Cys Arg Ala Ile TyrLys Arg Trp Pro Asn Ser Phe 1380 1385 1390 ACC GAT TCA GCC ACA GAG ACAGGT ACC GCA AAA CTG ACT GTG TGC CAA 4283 Thr Asp Ser Ala Thr Glu Thr GlyThr Ala Lys Leu Thr Val Cys Gln 1395 1400 1405 GGA AAG AAA GTG ATC CACGCG GTT GGC CCT GAT TTC CGG AAA CAC CCA 4331 Gly Lys Lys Val Ile His AlaVal Gly Pro Asp Phe Arg Lys His Pro 1410 1415 1420 GAG GCA GAA GCC CTGAAA TTG CTG CAA AAC GCC TAC CAT GCA GTG GCA 4379 Glu Ala Glu Ala Leu LysLeu Leu Gln Asn Ala Tyr His Ala Val Ala 1425 1430 1435 1440 GAC TTA GTAAAT GAA CAT AAT ATC AAG TCT GTC GCC ATC CCA CTG CTA 4427 Asp Leu Val AsnGlu His Asn Ile Lys Ser Val Ala Ile Pro Leu Leu 1445 1450 1455 TCT ACAGGC ATT TAC GCA GCC GGA AAA GAC CGC CTT GAG GTA TCA CTT 4475 Ser Thr GlyIle Tyr Ala Ala Gly Lys Asp Arg Leu Glu Val Ser Leu 1460 1465 1470 AACTGC TTG ACA ACC GCG CTA GAC AGA ACT GAT GCG GAC GTA ACC ATC 4523 Asn CysLeu Thr Thr Ala Leu Asp Arg Thr Asp Ala Asp Val Thr Ile 1475 1480 1485TAC TGC CTG GAT AAG AAG TGG AAG GAA AGA ATC GAC GCG GTG CTC CAA 4571 TyrCys Leu Asp Lys Lys Trp Lys Glu Arg Ile Asp Ala Val Leu Gln 1490 14951500 CTT AAG GAG TCT GTA ACT GAG CTG AAG GAT GAG GAT ATG GAG ATC GAC4619 Leu Lys Glu Ser Val Thr Glu Leu Lys Asp Glu Asp Met Glu Ile Asp1505 1510 1515 1520 GAC GAG TTA GTA TGG ATC CAT CCG GAC AGT TGC CTG AAGGGA AGA AAG 4667 Asp Glu Leu Val Trp Ile His Pro Asp Ser Cys Leu 
Lys GlyArg Lys 1525 1530 1535 GGA TTC AGT ACT ACA AAA GGA AAG TTG TAT TCG TACTTT GAA GGC ACC 4715 Gly Phe Ser Thr Thr Lys Gly Lys Leu Tyr Ser Tyr PheGlu Gly Thr 1540 1545 1550 AAA TTC CAT CAA GCA GCA AAA GAT ATG GCG GAGATA AAG GTC CTG TTC 4763 Lys Phe His Gln Ala Ala Lys Asp Met Ala Glu IleLys Val Leu Phe 1555 1560 1565 CCA AAT GAC CAG GAA AGC AAC GAA CAA CTGTGT GCC TAC ATA TTG GGG 4811 Pro Asn Asp Gln Glu Ser Asn Glu Gln Leu CysAla Tyr Ile Leu Gly 1570 1575 1580 GAG ACC ATG GAA GCA ATC CGC GAA AAATGC CCG GTC GAC CAC AAC CCG 4859 Glu Thr Met Glu Ala Ile Arg Glu Lys CysPro Val Asp His Asn Pro 1585 1590 1595 1600 TCG TCT AGC CCG CCA AAA ACGCTG CCG TGC CTC TGT ATG TAT GCC ATG 4907 Ser Ser Ser Pro Pro Lys Thr LeuPro Cys Leu Cys Met Tyr Ala Met 1605 1610 1615 ACG CCA GAA AGG GTC CACAGA CTC AGA AGC AAT AAC GTC AAA GAA GTT 4955 Thr Pro Glu Arg Val His ArgLeu Arg Ser Asn Asn Val Lys Glu Val 1620 1625 1630 ACA GTA TGC TCC TCCACC CCC CTT CCA AAG TAC AAA ATC AAG AAT GTT 5003 Thr Val Cys Ser Ser ThrPro Leu Pro Lys Tyr Lys Ile Lys Asn Val 1635 1640 1645 CAG AAG GTT CAGTGC ACA AAA GTA GTC CTG TTT AAC CCG CAT ACC CCC 5051 Gln Lys Val Gln CysThr Lys Val Val Leu Phe Asn Pro His Thr Pro 1650 1655 1660 GCA TTC GTTCCC GCC CGT AAG TAC ATA GAA GCA CCA GAA CAG CCT GCA 5099 Ala Phe Val ProAla Arg Lys Tyr Ile Glu Ala Pro Glu Gln Pro Ala 1665 1670 1675 1680 GCTCCG CCT GCA CAG GCC GAG GAG GCC CCC GGA GTT GTA GCG ACA CCA 5147 Ala ProPro Ala Gln Ala Glu Glu Ala Pro Gly Val Val Ala Thr Pro 1685 1690 1695ACA CCA CCT GCA GCT GAT AAC ACC TCG CTT GAT GTC ACG GAC ATC TCA 5195 ThrPro Pro Ala Ala Asp Asn Thr Ser Leu Asp Val Thr Asp Ile Ser 1700 17051710 CTG GAC ATG GAA GAC AGT AGC GAA GGC TCA CTC TTT TCG AGC TTT AGC5243 Leu Asp Met Glu Asp Ser Ser Glu Gly Ser Leu Phe Ser Ser Phe Ser1715 1720 1725 GGA TCG GAC AAC TAC CGA AGG CAG GTG GTG GTG GCT GAC GTCCAT GCC 5291 Gly Ser Asp Asn Tyr Arg Arg Gln Val Val Val Ala Asp Val HisAla 1730 1735 1740 GTC CAA GAG CCT GCC CCT GTT CCA CCG CCA AGG CTA AAGAAG ATG GCC 
5339 Val Gln Glu Pro Ala Pro Val Pro Pro Pro Arg Leu Lys LysMet Ala 1745 1750 1755 1760 CGC CTG GCA GCG GCA AGA ATG CAG GAA GAG CCAACT CCA CCG GCA AGC 5387 Arg Leu Ala Ala Ala Arg Met Gln Glu Glu Pro ThrPro Pro Ala Ser 1765 1770 1775 ACC AGC TCT GCG GAC GAG TCC CTT CAC CTTTCT TTT GAT GGG GTA TCT 5435 Thr Ser Ser Ala Asp Glu Ser Leu His Leu SerPhe Asp Gly Val Ser 1780 1785 1790 ATA TCC TTC GGA TCC CTT TTC GAC GGAGAG ATG GCC CGC TTG GCA GCG 5483 Ile Ser Phe Gly Ser Leu Phe Asp Gly GluMet Ala Arg Leu Ala Ala 1795 1800 1805 GCA CAA CCC CCG GCA AGT ACA TGCCCT ACG GAT GTG CCT ATG TCT TTC 5531 Ala Gln Pro Pro Ala Ser Thr Cys ProThr Asp Val Pro Met Ser Phe 1810 1815 1820 GGA TCG TTT TCC GAC GGA GAGATT GAG GAG TTG AGC CGC AGA GTA ACC 5579 Gly Ser Phe Ser Asp Gly Glu IleGlu Glu Leu Ser Arg Arg Val Thr 1825 1830 1835 1840 GAG TCG GAG CCC GTCCTG TTT GGG TCA TTT GAA CCG GGC GAA GTG AAC 5627 Glu Ser Glu Pro Val LeuPhe Gly Ser Phe Glu Pro Gly Glu Val Asn 1845 1850 1855 TCA ATT ATA TCGTCC CGA TCA GCC GTA TCT TTT CCA CCA CGC AAG CAG 5675 Ser Ile Ile Ser SerArg Ser Ala Val Ser Phe Pro Pro Arg Lys Gln 1860 1865 1870 AGA CGT AGACGC AGG AGC AGG AGG ACC GAA TAC TGT CTA ACC GGG GTA 5723 Arg Arg Arg ArgArg Ser Arg Arg Thr Glu Tyr Cys Leu Thr Gly Val 1875 1880 1885 GGT GGGTAC ATA TTT TCG ACG GAC ACA GGC CCT GGG CAC TTG CAA AAG 5771 Gly Gly TyrIle Phe Ser Thr Asp Thr Gly Pro Gly His Leu Gln Lys 1890 1895 1900 AAGTCC GTT CTG CAG AAC CAG CTT ACA GAA CCG ACC TTG GAG CGC AAT 5819 Lys SerVal Leu Gln Asn Gln Leu Thr Glu Pro Thr Leu Glu Arg Asn 1905 1910 19151920 GTT CTG GAA AGA ATC TAC GCC CCG GTG CTC GAC ACG TCG AAA GAG GAA5867 Val Leu Glu Arg Ile Tyr Ala Pro Val Leu Asp Thr Ser Lys Glu Glu1925 1930 1935 CAG CTC AAA CTC AGG TAC CAG ATG ATG CCC ACC GAA GCC AACAAA AGC 5915 Gln Leu Lys Leu Arg Tyr Gln Met Met Pro Thr Glu Ala Asn LysSer 1940 1945 1950 AGG TAC CAG TCT CGA AAA GTA GAA AAC CAG AAA GCC ATAACC ACT GAG 5963 Arg Tyr Gln Ser Arg Lys Val Glu Asn Gln Lys Ala Ile ThrThr Glu 1955 1960 1965 CGA 
CTG CTT TCA GGG CTA CGA CTG TAT AAC TCT GCCACA GAT CAG CCA 6011 Arg Leu Leu Ser Gly Leu Arg Leu Tyr Asn Ser Ala ThrAsp Gln Pro 1970 1975 1980 GAA TGC TAT AAG ATC ACC TAC CCG AAA CCA TCGTAT TCC AGC AGT GTA 6059 Glu Cys Tyr Lys Ile Thr Tyr Pro Lys Pro Ser TyrSer Ser Ser Val 1985 1990 1995 2000 CCA GCG AAC TAC TCT GAC CCA AAG TTTGCT GTA GCT GTT TGT AAC AAC 6107 Pro Ala Asn Tyr Ser Asp Pro Lys Phe AlaVal Ala Val Cys Asn Asn 2005 2010 2015 TAT CTG CAT GAG AAT TAC CCG ACGGTA GCA TCT TAT CAG ATC ACC GAC 6155 Tyr Leu His Glu Asn Tyr Pro Thr ValAla Ser Tyr Gln Ile Thr Asp 2020 2025 2030 GAG TAC GAT GCT TAC TTG GATATG GTA GAC GGG ACA GTC GCT TGC CTA 6203 Glu Tyr Asp Ala Tyr Leu Asp MetVal Asp Gly Thr Val Ala Cys Leu 2035 2040 2045 GAT ACT GCA ACT TTT TGCCCC GCC AAG CTT AGA AGT TAC CCG AAA AGA 6251 Asp Thr Ala Thr Phe Cys ProAla Lys Leu Arg Ser Tyr Pro Lys Arg 2050 2055 2060 CAC GAG TAT AGA GCCCCA AAC ATC CGC AGT GCG GTT CCA TCA GCG ATG 6299 His Glu Tyr Arg Ala ProAsn Ile Arg Ser Ala Val Pro Ser Ala Met 2065 2070 2075 2080 CAG AAC ACGTTG CAA AAC GTG CTC ATT GCC GCG ACT AAA AGA AAC TGC 6347 Gln Asn Thr LeuGln Asn Val Leu Ile Ala Ala Thr Lys Arg Asn Cys 2085 2090 2095 AAC GTCACA CAA ATG CGT GAA CTG CCA ACA CTG GAC TCA GCG ACA TTC 6395 Asn Val ThrGln Met Arg Glu Leu Pro Thr Leu Asp Ser Ala Thr Phe 2100 2105 2110 AACGTT GAA TGC TTT CGA AAA TAT GCA TGC AAT GAC GAG TAT TGG GAG 6443 Asn ValGlu Cys Phe Arg Lys Tyr Ala Cys Asn Asp Glu Tyr Trp Glu 2115 2120 2125GAG TTT GCC CGA AAG CCA ATT AGG ATC ACT ACT GAG TTC GTT ACC GCA 6491 GluPhe Ala Arg Lys Pro Ile Arg Ile Thr Thr Glu Phe Val Thr Ala 2130 21352140 TAC GTG GCC AGA CTG AAA GGC CCT AAG GCC GCC GCA CTG TTC GCA AAG6539 Tyr Val Ala Arg Leu Lys Gly Pro Lys Ala Ala Ala Leu Phe Ala Lys2145 2150 2155 2160 ACG CAT AAT TTG GTC CCA TTG CAA GAA GTG CCT ATG GATAGA TTC GTC 6587 Thr His Asn Leu Val Pro Leu Gln Glu Val Pro Met Asp ArgPhe Val 2165 2170 2175 ATG GAC ATG AAA AGA GAC GTG AAA GTT ACA CCT GGCACG AAA CAC ACA 6635 Met Asp Met Lys Arg Asp 
Val Lys Val Thr Pro Gly ThrLys His Thr 2180 2185 2190 GAA GAA AGA CCG AAA GTA CAA GTG ATA CAA GCCGCA GAA CCC CTG GCG 6683 Glu Glu Arg Pro Lys Val Gln Val Ile Gln Ala AlaGlu Pro Leu Ala 2195 2200 2205 ACC GCT TAC CTA TGC GGG ATC CAC CGG GAGTTA GTG CGC AGG CTT ACA 6731 Thr Ala Tyr Leu Cys Gly Ile His Arg Glu LeuVal Arg Arg Leu Thr 2210 2215 2220 GCC GTT TTG CTA CCC AAC ATT CAC ACGCTC TTT GAC ATG TCG GCG GAG 6779 Ala Val Leu Leu Pro Asn Ile His Thr LeuPhe Asp Met Ser Ala Glu 2225 2230 2235 2240 GAC TTT GAT GCA ATC ATA GCAGAA CAC TTC AAG CAA GGT GAC CCG GTA 6827 Asp Phe Asp Ala Ile Ile Ala GluHis Phe Lys Gln Gly Asp Pro Val 2245 2250 2255 CTG GAG ACG GAT ATC GCCTCG TTC GAC AAA AGC CAA GAC GAC GCT ATG 6875 Leu Glu Thr Asp Ile Ala SerPhe Asp Lys Ser Gln Asp Asp Ala Met 2260 2265 2270 GCG TTA ACC GGC CTGATG ATC TTG GAA GAC CTG GGT GTG GAC CAA CCA 6923 Ala Leu Thr Gly Leu MetIle Leu Glu Asp Leu Gly Val Asp Gln Pro 2275 2280 2285 CTA CTC GAC TTGATC GAG TGC GCC TTT GGA GAA ATA TCA TCC ACC CAT 6971 Leu Leu Asp Leu IleGlu Cys Ala Phe Gly Glu Ile Ser Ser Thr His 2290 2295 2300 CTG CCC ACGGGT ACC CGT TTC AAA TTC GGG GCG ATG ATG AAA TCC GGA 7019 Leu Pro Thr GlyThr Arg Phe Lys Phe Gly Ala Met Met Lys Ser Gly 2305 2310 2315 2320 ATGTTC CTC ACG CTC TTT GTC AAC ACA GTT CTG AAT GTC GTT ATC GCC 7067 Met PheLeu Thr Leu Phe Val Asn Thr Val Leu Asn Val Val Ile Ala 2325 2330 2335AGC AGA GTA TTG GAG GAG CGG CTT AAA ACG TCC AAA TGT GCA GCA TTT 7115 SerArg Val Leu Glu Glu Arg Leu Lys Thr Ser Lys Cys Ala Ala Phe 2340 23452350 ATC GGC GAC GAC AAC ATT ATA CAC GGA GTA GTA TCT GAC AAA GAA ATG7163 Ile Gly Asp Asp Asn Ile Ile His Gly Val Val Ser Asp Lys Glu Met2355 2360 2365 GCT GAG AGG TGT GCC ACC TGG CTC AAC ATG GAG GTT AAG ATCATT GAC 7211 Ala Glu Arg Cys Ala Thr Trp Leu Asn Met Glu Val Lys Ile IleAsp 2370 2375 2380 GCA GTC ATC GGC GAG AGA CCA CCT TAC TTC TGC GGT GGATTC ATC TTG 7259 Ala Val Ile Gly Glu Arg Pro Pro Tyr Phe Cys Gly Gly PheIle Leu 2385 2390 2395 2400 CAA GAT TCG GTT ACC TCC ACA GCG 
TGT CGC GTGGCG GAC CCC TTG AAA 7307 Gln Asp Ser Val Thr Ser Thr Ala Cys Arg Val AlaAsp Pro Leu Lys 2405 2410 2415 AGG CTG TTT AAG TTG GGT AAA CCG CTC CCAGCC GAC GAT GAG CAA GAC 7355 Arg Leu Phe Lys Leu Gly Lys Pro Leu Pro AlaAsp Asp Glu Gln Asp 2420 2425 2430 GAA GAC AGA AGA CGC GCT CTG CTA GATGAA ACA AAG GCG TGG TTT AGA 7403 Glu Asp Arg Arg Arg Ala Leu Leu Asp GluThr Lys Ala Trp Phe Arg 2435 2440 2445 GTA GGT ATA ACA GAC ACC TTA GCAGTG GCC GTG GCA ACT CGG TAT GAG 7451 Val Gly Ile Thr Asp Thr Leu Ala ValAla Val Ala Thr Arg Tyr Glu 2450 2455 2460 GTA GAC AAC ATC ACA CCT GTCCTG CTG GCA TTG AGA ACT TTT GCC CAG 7499 Val Asp Asn Ile Thr Pro Val LeuLeu Ala Leu Arg Thr Phe Ala Gln 2465 2470 2475 2480 AGC AAA AGA GCA TTTCAA GCC ATC AGA GGG GAA ATA AAG CAT CTC TAC 7547 Ser Lys Arg Ala Phe GlnAla Ile Arg Gly Glu Ile Lys His Leu Tyr 2485 2490 2495 GGT GGT CCT AAATAGTCAGCAT AGTACATTTC ATCTGACTAA TACCACAACA 7599 Gly Gly Pro Lys 2500CCACCACC ATG AAT AGA GGA TTC TTT AAC ATG CTC GGC CGC CGC CCC TTC 7649Met Asn Arg Gly Phe Phe Asn Met Leu Gly Arg Arg Pro Phe 1 5 10 CCA GCCCCC ACT GCC ATG TGG AGG CCG CGG AGA AGG AGG CAG GCG GCC 7697 Pro Ala ProThr Ala Met Trp Arg Pro Arg Arg Arg Arg Gln Ala Ala 15 20 25 30 CCG ATGCCT GCC CGC AAT GGG CTG GCT TCC CAA ATC CAG CAA CTG ACC 7745 Pro Met ProAla Arg Asn Gly Leu Ala Ser Gln Ile Gln Gln Leu Thr 35 40 45 ACA GCC GTCAGT GCC CTA GTC ATT GGA CAG GCA ACT AGA CCT CAA ACC 7793 Thr Ala Val SerAla Leu Val Ile Gly Gln Ala Thr Arg Pro Gln Thr 50 55 60 CCA CGC CCA CGCCCG CCG CCG CGC CAG AAG AAG CAG GCG CCA AAG CAA 7841 Pro Arg Pro Arg ProPro Pro Arg Gln Lys Lys Gln Ala Pro Lys Gln 65 70 75 CCA CCG AAG CCG AAGAAA CCA AAA ACA CAG GAG AAG AAG AAG AAG CAA 7889 Pro Pro Lys Pro Lys LysPro Lys Thr Gln Glu Lys Lys Lys Lys Gln 80 85 90 CCT GCA AAA CCC AAA CCCGGA AAG AGA CAG CGT ATG GCA CTT AAG TTG 7937 Pro Ala Lys Pro Lys Pro GlyLys Arg Gln Arg Met Ala Leu Lys Leu 95 100 105 110 GAG GCC GAC AGA CTGTTC GAC GTC AAA AAT GAG GAC GGA GAT GTC ATC 7985 Glu Ala Asp Arg 
Leu PheAsp Val Lys Asn Glu Asp Gly Asp Val Ile 115 120 125 GGG CAC GCA CTG GCCATG GAA GGA AAG GTA ATG AAA CCA CTC CAC GTG 8033 Gly His Ala Leu Ala MetGlu Gly Lys Val Met Lys Pro Leu His Val 130 135 140 AAA GGA ACT ATT GACCAC CCT GTG CTA TCA AAG CTC AAA TTC ACC AAG 8081 Lys Gly Thr Ile Asp HisPro Val Leu Ser Lys Leu Lys Phe Thr Lys 145 150 155 TCG TCA GCA TAC GACATG GAG TTC GCA CAG TTG CCG GTC AAC ATG AGA 8129 Ser Ser Ala Tyr Asp MetGlu Phe Ala Gln Leu Pro Val Asn Met Arg 160 165 170 AGT GAG GCG TTC ACCTAC ACC AGT GAA CAC CCT GAA GGG TTC TAC AAC 8177 Ser Glu Ala Phe Thr TyrThr Ser Glu His Pro Glu Gly Phe Tyr Asn 175 180 185 190 TGG CAC CAC GGAGCG GTG CAG TAT AGT GGA GGC AGA TTT ACC ATC CCC 8225 Trp His His Gly AlaVal Gln Tyr Ser Gly Gly Arg Phe Thr Ile Pro 195 200 205 CGC GGA GTA GGAGGC AGA GGA GAC AGT GGT CGT CCG ATT ATG GAT AAC 8273 Arg Gly Val Gly GlyArg Gly Asp Ser Gly Arg Pro Ile Met Asp Asn 210 215 220 TCA GGC CGG GTTGTC GCG ATA GTC CTC GGA GGG GCT GAT GAG GGA ACA 8321 Ser Gly Arg Val ValAla Ile Val Leu Gly Gly Ala Asp Glu Gly Thr 225 230 235 AGA ACC GCC CTTTCG GTC GTC ACC TGG AAT AGC AAA GGG AAG ACA ATC 8369 Arg Thr Ala Leu SerVal Val Thr Trp Asn Ser Lys Gly Lys Thr Ile 240 245 250 AAG ACA ACC CCGGAA GGG ACA GAA GAG TGG TCT GCT GCA CCA CTG GTC 8417 Lys Thr Thr Pro GluGly Thr Glu Glu Trp Ser Ala Ala Pro Leu Val 255 260 265 270 ACG GCC ATGTGC TTG CTT GGA AAC GTG AGC TTC CCA TGC AAT CGC CCG 8465 Thr Ala Met CysLeu Leu Gly Asn Val Ser Phe Pro Cys Asn Arg Pro 275 280 285 CCC ACA TGCTAC ACC CGC GAA CCA TCC AGA GCT CTC GAC ATC CTC GAA 8513 Pro Thr Cys TyrThr Arg Glu Pro Ser Arg Ala Leu Asp Ile Leu Glu 290 295 300 GAG AAC GTGAAC CAC GAG GCC TAC GAC ACC CTG CTC AAC GCC ATA TTG 8561 Glu Asn Val AsnHis Glu Ala Tyr Asp Thr Leu Leu Asn Ala Ile Leu 305 310 315 CGG TGC GGATCG TCC GGC AGA AGT AAA AGA AGC GTC ACT GAC GAC TTT 8609 Arg Cys Gly SerSer Gly Arg Ser Lys Arg Ser Val Thr Asp Asp Phe 320 325 330 ACC TTG ACCAGC CCG TAC TTG GGC ACA TGC TCG TAC TGT CAC CAT ACT 8657 Thr 
Leu Thr SerPro Tyr Leu Gly Thr Cys Ser Tyr Cys His His Thr 335 340 345 350 GAA CCGTGC TTT AGC CCG ATT AAG ATC GAG CAG GTC TGG GAT GAA GCG 8705 Glu Pro CysPhe Ser Pro Ile Lys Ile Glu Gln Val Trp Asp Glu Ala 355 360 365 GAC GACAAC ACC ATA CGC ATA CAG ACT TCC GCC CAG TTT GGA TAC GAC 8753 Asp Asp AsnThr Ile Arg Ile Gln Thr Ser Ala Gln Phe Gly Tyr Asp 370 375 380 CAA AGCGGA GCA GCA AGC TCA AAT AAG TAC CGC TAC ATG TCG CTC GAG 8801 Gln Ser GlyAla Ala Ser Ser Asn Lys Tyr Arg Tyr Met Ser Leu Glu 385 390 395 CAG GATCAT ACT GTC AAA GAA GGC ACC ATG GAT GAC ATC AAG ATC AGC 8849 Gln Asp HisThr Val Lys Glu Gly Thr Met Asp Asp Ile Lys Ile Ser 400 405 410 ACC TCAGGA CCG TGT AGA AGG CTT AGC TAC AAA GGA TAC TTT CTC CTC 8897 Thr Ser GlyPro Cys Arg Arg Leu Ser Tyr Lys Gly Tyr Phe Leu Leu 415 420 425 430 GCGAAG TGT CCT CCA GGG GAC AGC GTA ACG GTT AGC ATA GCG AGT AGC 8945 Ala LysCys Pro Pro Gly Asp Ser Val Thr Val Ser Ile Ala Ser Ser 435 440 445 AACTCA GCA ACG TCA TGC ACA ATG GCC CGC AAG ATA AAA CCA AAA TTC 8993 Asn SerAla Thr Ser Cys Thr Met Ala Arg Lys Ile Lys Pro Lys Phe 450 455 460 GTGGGA CGG GAA AAA TAT GAC CTA CCT CCC GTT CAC GGT AAG AAG ATT 9041 Val GlyArg Glu Lys Tyr Asp Leu Pro Pro Val His Gly Lys Lys Ile 465 470 475 CCTTGC ACA GTG TAC GAC CGT CTG AAA GAA ACA ACC GCC GGC TAC ATC 9089 Pro CysThr Val Tyr Asp Arg Leu Lys Glu Thr Thr Ala Gly Tyr Ile 480 485 490 ACTATG CAC AGG CCG GGA CCG CAT GCC TAT ACA TCC TAT CTG GAG GAA 9137 Thr MetHis Arg Pro Gly Pro His Ala Tyr Thr Ser Tyr Leu Glu Glu 495 500 505 510TCA TCA GGG AAA GTT TAC GCG AAG CCA CCA TCC GGG AAG AAC ATT ACG 9185 SerSer Gly Lys Val Tyr Ala Lys Pro Pro Ser Gly Lys Asn Ile Thr 515 520 525TAC GAG TGC AAG TGC GGC GAT TAC AAG ACC GGA ACC GTT ACG ACC CGT 9233 TyrGlu Cys Lys Cys Gly Asp Tyr Lys Thr Gly Thr Val Thr Thr Arg 530 535 540ACC GAA ATC ACG GGC TGC ACC GCC ATC AAG CAG TGC GTC GCC TAT AAG 9281 ThrGlu Ile Thr Gly Cys Thr Ala Ile Lys Gln Cys Val Ala Tyr Lys 545 550 555AGC GAC CAA ACG AAG TGG GTC TTC AAC TCG CCG GAC TCG ATC 
AGA CAC 9329 SerAsp Gln Thr Lys Trp Val Phe Asn Ser Pro Asp Ser Ile Arg His 560 565 570GCC GAC CAC ACG GCC CAA GGG AAA TTG CAT TTG CCT TTC AAG CTG ATC 9377 AlaAsp His Thr Ala Gln Gly Lys Leu His Leu Pro Phe Lys Leu Ile 575 580 585590 CCG AGT ACC TGC ATG GTC CCT GTT GCC CAC GCG CCG AAC GTA GTA CAC 9425Pro Ser Thr Cys Met Val Pro Val Ala His Ala Pro Asn Val Val His 595 600605 GGC TTT AAA CAC ATC AGC CTC CAA TTA GAC ACA GAC CAT CTG ACA TTG 9473Gly Phe Lys His Ile Ser Leu Gln Leu Asp Thr Asp His Leu Thr Leu 610 615620 CTC ACC ACC AGG AGA CTA GGG GCA AAC CCG GAA CCA ACC ACT GAA TGG 9521Leu Thr Thr Arg Arg Leu Gly Ala Asn Pro Glu Pro Thr Thr Glu Trp 625 630635 ATC ATC GGA AAC ACG GTT AGA AAC TTC ACC GTC GAC CGA GAT GGC CTG 9569Ile Ile Gly Asn Thr Val Arg Asn Phe Thr Val Asp Arg Asp Gly Leu 640 645650 GAA TAC ATA TGG GGC AAT CAC GAA CCA GTA AGG GTC TAT GCC CAA GAG 9617Glu Tyr Ile Trp Gly Asn His Glu Pro Val Arg Val Tyr Ala Gln Glu 655 660665 670 TCT GCA CCA GGA GAC CCT CAC GGA TGG CCA CAC GAA ATA GTA CAG CAT9665 Ser Ala Pro Gly Asp Pro His Gly Trp Pro His Glu Ile Val Gln His 675680 685 TAC TAT CAT CGC CAT CCT GTG TAC ACC ATC TTA GCC GTC GCA TCA GCT9713 Tyr Tyr His Arg His Pro Val Tyr Thr Ile Leu Ala Val Ala Ser Ala 690695 700 GCT GTG GCG ATG ATG ATT GGC GTA ACT GTT GCA GCA TTA TGT GCC TGT9761 Ala Val Ala Met Met Ile Gly Val Thr Val Ala Ala Leu Cys Ala Cys 705710 715 AAA GCG CGC CGT GAG TGC CTG ACG CCA TAT GCC CTG GCC CCA AAT GCC9809 Lys Ala Arg Arg Glu Cys Leu Thr Pro Tyr Ala Leu Ala Pro Asn Ala 720725 730 GTG ATT CCA ACT TCG CTG GCA CTT TTG TGC TGT GTT AGG TCG GCT AAT9857 Val Ile Pro Thr Ser Leu Ala Leu Leu Cys Cys Val Arg Ser Ala Asn 735740 745 750 GCT GAA ACA TTC ACC GAG ACC ATG AGT TAC TTA TGG TCG AAC AGCCAG 9905 Ala Glu Thr Phe Thr Glu Thr Met Ser Tyr Leu Trp Ser Asn Ser Gln755 760 765 CCG TTC TTC TGG GTC CAG CTG TGT ATA CCT CTG GCC GCT GTC GTCGTT 9953 Pro Phe Phe Trp Val Gln Leu Cys Ile Pro Leu Ala Ala Val Val Val770 775 780 CTA ATG CGC TGT TGC TCA TGC TGC CTG CCT 
TTT TTA GTG GTT GCCGGC 10001 Leu Met Arg Cys Cys Ser Cys Cys Leu Pro Phe Leu Val Val AlaGly 785 790 795 GCC TAC CTG GCG AAG GTA GAC GCC TAC GAA CAT GCG ACC ACTGTT CCA 10049 Ala Tyr Leu Ala Lys Val Asp Ala Tyr Glu His Ala Thr ThrVal Pro 800 805 810 AAT GTG CCA CAG ATA CCG TAT AAG GCA CTT GTT GAA AGGGCA GGG TAC 10097 Asn Val Pro Gln Ile Pro Tyr Lys Ala Leu Val Glu ArgAla Gly Tyr 815 820 825 830 GCC CCG CTC AAT TTG GAG ATT ACT GTC ATG TCCTCG GAG GTT TTG CCT 10145 Ala Pro Leu Asn Leu Glu Ile Thr Val Met SerSer Glu Val Leu Pro 835 840 845 TCC ACC AAC CAA GAG TAC ATT ACC TGC AAATTC ACC ACT GTG GTC CCC 10193 Ser Thr Asn Gln Glu Tyr Ile Thr Cys LysPhe Thr Thr Val Val Pro 850 855 860 TCC CCT AAA GTC AGA TGC TGC GGC TCCTTG GAA TGT CAG CCC GCC GCT 10241 Ser Pro Lys Val Arg Cys Cys Gly SerLeu Glu Cys Gln Pro Ala Ala 865 870 875 CAC GCA GAC TAT ACC TGC AAG GTCTTT GGA GGG GTG TAC CCC TTC ATG 10289 His Ala Asp Tyr Thr Cys Lys ValPhe Gly Gly Val Tyr Pro Phe Met 880 885 890 TGG GGA GGA GCA CAA TGT TTTTGC GAC AGT GAG AAC AGC CAG ATG AGT 10337 Trp Gly Gly Ala Gln Cys PheCys Asp Ser Glu Asn Ser Gln Met Ser 895 900 905 910 GAG GCG TAC GTC GAATTG TCA GTA GAT TGC GCG ACT GAC CAC GCG CAG 10385 Glu Ala Tyr Val GluLeu Ser Val Asp Cys Ala Thr Asp His Ala Gln 915 920 925 GCG ATT AAG GTGCAT ACT GCC GCG ATG AAA GTA GGA CTG CGT ATA GTG 10433 Ala Ile Lys ValHis Thr Ala Ala Met Lys Val Gly Leu Arg Ile Val 930 935 940 TAC GGG AACACT ACC AGT TTC CTA GAT GTG TAC GTG AAC GGA GTC ACA 10481 Tyr Gly AsnThr Thr Ser Phe Leu Asp Val Tyr Val Asn Gly Val Thr 945 950 955 CCA GGAACG TCT AAA GAC CTG AAA GTC ATA GCT GGA CCA ATT TCA GCA 10529 Pro GlyThr Ser Lys Asp Leu Lys Val Ile Ala Gly Pro Ile Ser Ala 960 965 970 TTGTTT ACA CCA TTC GAT CAC AAG GTC GTT ATC AAT CGC GGC CTG GTG 10577 LeuPhe Thr Pro Phe Asp His Lys Val Val Ile Asn Arg Gly Leu Val 975 980 985990 TAC AAC TAT GAC TTT CCG GAA TAC GGA GCG ATG AAA CCA GGA GCG TTT10625 Tyr Asn Tyr Asp Phe Pro Glu Tyr Gly Ala Met Lys Pro Gly Ala Phe995 1000 1005 GGA GAC 
ATT CAA GCT ACC TCC TTG ACT AGC AAA GAC CTC ATCGCC AGC 10673 Gly Asp Ile Gln Ala Thr Ser Leu Thr Ser Lys Asp Leu IleAla Ser 1010 1015 1020 ACA GAC ATT AGG CTA CTC AAG CCT TCC GCC AAG AACGTG CAT GTC CCG 10721 Thr Asp Ile Arg Leu Leu Lys Pro Ser Ala Lys AsnVal His Val Pro 1025 1030 1035 TAC ACG CAG GCC GCA TCT GGA TTC GAG ATGTGG AAA AAC AAC TCA GGC 10769 Tyr Thr Gln Ala Ala Ser Gly Phe Glu MetTrp Lys Asn Asn Ser Gly 1040 1045 1050 CGC CCA CTG CAG GAA ACC GCC CCTTTT GGG TGC AAG ATT GCA GTC AAT 10817 Arg Pro Leu Gln Glu Thr Ala ProPhe Gly Cys Lys Ile Ala Val Asn 1055 1060 1065 1070 CCG CTT CGA GCG GTGGAC TGC TCA TAC GGG AAC ATT CCC ATT TCT ATT 10865 Pro Leu Arg Ala ValAsp Cys Ser Tyr Gly Asn Ile Pro Ile Ser Ile 1075 1080 1085 GAC ATC CCGAAC GCT GCC TTT ATC AGG ACA TCA GAT GCA CCA CTG GTC 10913 Asp Ile ProAsn Ala Ala Phe Ile Arg Thr Ser Asp Ala Pro Leu Val 1090 1095 1100 TCAACA GTC AAA TGT GAT GTC AGT GAG TGC ACT TAT TCA GCG GAC TTC 10961 SerThr Val Lys Cys Asp Val Ser Glu Cys Thr Tyr Ser Ala Asp Phe 1105 11101115 GGA GGG ATG GCT ACC CTG CAG TAT GTA TCC GAC CGC GAA GGA CAA TGC11009 Gly Gly Met Ala Thr Leu Gln Tyr Val Ser Asp Arg Glu Gly Gln Cys1120 1125 1130 CCT GTA CAT TCG CAT TCG AGC ACA GCA ACC CTC CAA GAG TCGACA GTT 11057 Pro Val His Ser His Ser Ser Thr Ala Thr Leu Gln Glu SerThr Val 1135 1140 1145 1150 CAT GTC CTG GAG AAA GGA GCG GTG ACA GTA CACTTC AGC ACC GCG AGC 11105 His Val Leu Glu Lys Gly Ala Val Thr Val HisPhe Ser Thr Ala Ser 1155 1160 1165 CCA CAG GCG AAC TTC ATT GTA TCG CTGTGT GGT AAG AAG ACA ACA TGC 11153 Pro Gln Ala Asn Phe Ile Val Ser LeuCys Gly Lys Lys Thr Thr Cys 1170 1175 1180 AAT GCA GAA TGC AAA CCA CCAGCT GAT CAT ATC GTG AGC ACC CCG CAC 11201 Asn Ala Glu Cys Lys Pro ProAla Asp His Ile Val Ser Thr Pro His 1185 1190 1195 AAA AAT GAC CAA GAATTC CAA GCC GCC ATC TCA AAA ACT TCA TGG AGT 11249 Lys Asn Asp Gln GluPhe Gln Ala Ala Ile Ser Lys Thr Ser Trp Ser 1200 1205 1210 TGG CTG TTTGCC CTT TTC GGC GGC GCC TCG TCG CTA TTA ATT ATA GGA 11297 Trp Leu PheAla Leu 
Phe Gly Gly Ala Ser Ser Leu Leu Ile Ile Gly 1215 1220 1225 1230 CTT ATG ATT TTT GCT TGC AGC ATG ATG CTG ACT AGC ACA CGA AGA 11342 Leu Met Ile Phe Ala Cys Ser Met Met Leu Thr Ser Thr Arg Arg 1235 1240 1245 TGACCGCTAC GCCCCAATGA CCCGACCAGC AAAACTCGAT GTACTTCCGA GGAACTGATG 11402 TGCATAATGC ATCAGGCTGG TATATTAGAT CCCCGCTTAC CGCGGGCAAT ATAGCAACAC 11462 CAAAACTCGA CGTATTTCCG AGGAAGCGCA GTGCATAATG CTGCGCAGTG TTGCCAAATA 11522 ATCACTATAT TAACCATTTA TTCAGCGGAC GCCAAAACTC AATGTATTTC TGAGGAAGCA 11582 TGGTGCATAA TGCCATGCAG CGTCTGCATA ACTTTTTATT ATTTCTTTTA TTAATCAACA 11642 AAATTTTGTT TTTAACATTT C 11663
SEQ ID NO: 2: 2500 amino acids; type: amino acid; topology: linear; molecule type: protein
Met Glu Lys Pro Val Val Asn Val Asp Val Asp Pro Gln Ser Pro Phe 1 5 10 15 Val Val Gln Leu Gln Lys Ser Phe Pro Gln Phe Glu Val Val Ala Gln 20 25 30 Gln Val Thr Pro Asn Asp His Ala Asn Ala Arg Ala Phe Ser His Leu 35 40 45 Ala Ser Lys Leu Ile Glu Leu Glu Val Pro Thr Thr Ala Thr Ile Leu 50 55 60 Asp Ile Gly Ser Ala Pro Ala Arg Arg Met Phe Ser Glu His Gln Tyr 65 70 75 80 His Cys Val Cys Pro Met Arg Ser Pro Glu Asp Pro Asp Arg Met Met 85 90 95 Lys Tyr Ala Ser Lys Leu Ala Glu Lys Ala Cys Lys Ile Thr Asn Lys 100 105 110 Asn Leu His Glu Lys Ile Lys Asp Leu Arg Thr Val Leu Asp Thr Pro 115 120 125 Asp Ala Glu Thr Pro Ser Leu Cys Phe His Asn Asp Val Thr Cys Asn 130 135 140 Thr Arg Ala Glu Tyr Ser Val Met Gln Asp Val Tyr Ile Asn Ala Pro 145 150 155 160 Gly Thr Ile Tyr His Gln Ala Met Lys Gly Val Arg Thr Leu Tyr Trp 165 170 175 Ile Gly Phe Asp Thr Thr Gln Phe Met Phe Ser Ala Met Ala Gly Ser 180 185 190 Tyr Pro Ala Tyr Asn Thr Asn Trp Ala Asp Glu Lys Val Leu Glu Ala 195 200 205 Arg Asn Ile Gly Leu Cys Ser Thr Lys Leu Ser Glu Gly Arg Thr Gly 210 215 220 Lys Leu Ser Ile Met Arg Lys Lys Glu Leu Lys Pro Gly Ser Arg Val 225 230 235 240 Tyr Phe Ser Val Gly Ser Thr Leu Tyr Pro Glu His Arg Ala Ser Leu 245 250 255 Gln Ser Trp His Leu Pro Ser Val Phe His Leu Lys Gly Lys Gln Ser 260 265 270 Tyr Thr Cys Arg Cys Asp Thr Val Val Ser Cys Glu Gly Tyr Val Val 275 280 285 Lys Lys Ile Thr Ile Ser
Pro Gly Ile Thr Gly Glu Thr Val Gly Tyr 290 295 300 AlaVal Thr Asn Asn Ser Glu Gly Phe Leu Leu Cys Lys Val Thr Asp 305 310 315320 Thr Val Lys Gly Glu Arg Val Ser Phe Pro Val Cys Thr Tyr Ile Pro 325330 335 Ala Thr Ile Cys Asp Gln Met Thr Gly Ile Met Ala Thr Asp Ile Ser340 345 350 Pro Asp Asp Ala Gln Lys Leu Leu Val Gly Leu Asn Gln Arg IleVal 355 360 365 Ile Asn Gly Lys Thr Asn Arg Asn Thr Asn Thr Met Gln AsnTyr Leu 370 375 380 Leu Pro Ile Ile Ala Gln Gly Phe Ser Lys Trp Ala LysGlu Arg Lys 385 390 395 400 Glu Asp Leu Asp Asn Glu Lys Met Leu Gly ThrArg Glu Arg Lys Leu 405 410 415 Thr Tyr Gly Cys Leu Trp Ala Phe Arg ThrLys Lys Val His Ser Phe 420 425 430 Tyr Arg Pro Pro Gly Thr Gln Thr IleVal Lys Val Pro Ala Ser Phe 435 440 445 Ser Ala Phe Pro Met Ser Ser ValTrp Thr Thr Ser Leu Pro Met Ser 450 455 460 Leu Arg Gln Lys Met Lys LeuAla Leu Gln Pro Lys Lys Glu Glu Lys 465 470 475 480 Leu Leu Gln Val ProGlu Glu Leu Val Met Glu Ala Lys Ala Ala Phe 485 490 495 Glu Asp Ala GlnGlu Glu Ser Arg Ala Glu Lys Leu Arg Glu Ala Leu 500 505 510 Pro Pro LeuVal Ala Asp Lys Gly Ile Glu Ala Ala Ala Glu Val Val 515 520 525 Cys GluVal Glu Gly Leu Gln Ala Asp Thr Gly Ala Ala Leu Val Glu 530 535 540 ThrPro Arg Gly His Val Arg Ile Ile Pro Gln Ala Asn Asp Arg Met 545 550 555560 Ile Gly Gln Tyr Ile Val Val Ser Pro Ile Ser Val Leu Lys Asn Ala 565570 575 Lys Leu Ala Pro Ala His Pro Leu Ala Asp Gln Val Lys Ile Ile Thr580 585 590 His Ser Gly Arg Ser Gly Arg Tyr Ala Val Glu Pro Tyr Asp AlaLys 595 600 605 Val Leu Met Pro Ala Gly Ser Ala Val Pro Trp Pro Glu PheLeu Ala 610 615 620 Leu Ser Glu Ser Ala Thr Leu Val Tyr Asn Glu Arg GluPhe Val Asn 625 630 635 640 Arg Lys Leu Tyr His Ile Ala Met His Gly ProAla Lys Asn Thr Glu 645 650 655 Glu Glu Gln Tyr Lys Val Thr Lys Ala GluLeu Ala Glu Thr Glu Tyr 660 665 670 Val Phe Asp Val Asp Lys Lys Arg CysVal Lys Lys Glu Glu Ala Ser 675 680 685 Gly Leu Val Leu Ser Gly Glu LeuThr Asn Pro Pro Tyr His Glu Leu 690 695 700 Ala Leu Glu Gly Leu Lys ThrArg Pro Ala Val Pro Tyr Lys 
Val Glu 705 710 715 720 Thr Ile Gly Val IleGly Thr Pro Gly Ser Gly Lys Ser Ala Ile Ile 725 730 735 Lys Ser Thr ValThr Ala Arg Asp Leu Val Thr Ser Gly Lys Lys Glu 740 745 750 Asn Cys ArgGlu Ile Glu Ala Asp Val Leu Arg Leu Arg Gly Met Gln 755 760 765 Ile ThrSer Lys Thr Val Asp Ser Val Met Leu Asn Gly Cys His Lys 770 775 780 AlaVal Glu Val Leu Tyr Val Asp Glu Ala Phe Arg Cys His Ala Gly 785 790 795800 Ala Leu Leu Ala Leu Ile Ala Ile Val Arg Pro Arg Lys Lys Val Val 805810 815 Leu Cys Gly Asp Pro Lys Gln Cys Gly Phe Phe Asn Met Met Gln Leu820 825 830 Lys Val His Phe Asn His Pro Glu Lys Asp Ile Cys Thr Lys ThrPhe 835 840 845 Tyr Lys Phe Ile Ser Arg Arg Cys Thr Gln Pro Val Thr AlaIle Val 850 855 860 Ser Thr Leu His Tyr Asp Gly Lys Met Lys Thr Thr AsnPro Cys Lys 865 870 875 880 Lys Asn Ile Glu Ile Asp Ile Thr Gly Ala ThrLys Pro Lys Pro Gly 885 890 895 Asp Ile Ile Leu Thr Cys Phe Arg Gly TrpVal Lys Gln Leu Gln Ile 900 905 910 Asp Tyr Pro Gly His Glu Val Met ThrAla Ala Ala Ser Gln Gly Leu 915 920 925 Thr Arg Lys Gly Val Tyr Ala ValArg Gln Lys Val Asn Glu Asn Pro 930 935 940 Leu Tyr Ala Ile Thr Ser GluHis Val Asn Val Leu Leu Thr Arg Thr 945 950 955 960 Glu Asp Arg Leu ValTrp Lys Thr Leu Gln Gly Asp Pro Trp Ile Lys 965 970 975 Gln Leu Thr AsnVal Pro Lys Gly Asn Phe Gln Ala Thr Ile Glu Asp 980 985 990 Trp Glu AlaGlu His Lys Gly Ile Ile Ala Ala Ile Asn Ser Pro Ala 995 1000 1005 ProArg Thr Asn Pro Phe Ser Cys Lys Thr Asn Val Cys Trp Ala Lys 1010 10151020 Ala Leu Glu Pro Ile Leu Ala Thr Ala Gly Ile Val Leu Thr Gly Cys1025 1030 1035 1040 Gln Trp Ser Glu Leu Phe Pro Gln Phe Ala Asp Asp LysPro His Ser 1045 1050 1055 Ala Ile Tyr Ala Leu Asp Val Ile Cys Ile LysPhe Phe Gly Met Asp 1060 1065 1070 Leu Thr Ser Gly Leu Phe Ser Lys GlnSer Ile Pro Leu Thr Tyr His 1075 1080 1085 Pro Ala Asp Ser Ala Arg ProVal Ala His Trp Asp Asn Ser Pro Gly 1090 1095 1100 Thr Arg Lys Tyr GlyTyr Asp His Ala Val Ala Ala Glu Leu Ser Arg 1105 1110 1115 1120 Arg PhePro Val Phe Gln Leu Ala Gly Lys Gly Thr Gln Leu 
Asp Leu 1125 1130 1135Gln Thr Gly Arg Thr Arg Val Ile Ser Ala Gln His Asn Leu Val Pro 11401145 1150 Val Asn Arg Asn Leu Pro His Ala Leu Val Pro Glu His Lys GluLys 1155 1160 1165 Gln Pro Gly Pro Val Glu Lys Phe Leu Ser Gln Phe LysHis His Ser 1170 1175 1180 Val Leu Val Ile Ser Glu Lys Lys Ile Glu AlaPro His Lys Arg Ile 1185 1190 1195 1200 Glu Trp Ile Ala Pro Ile Gly IleAla Gly Ala Asp Lys Asn Tyr Asn 1205 1210 1215 Leu Ala Phe Gly Phe ProPro Gln Ala Arg Tyr Asp Leu Val Phe Ile 1220 1225 1230 Asn Ile Gly ThrLys Tyr Arg Asn His His Phe Gln Gln Cys Glu Asp 1235 1240 1245 His AlaAla Thr Leu Lys Thr Leu Ser Arg Ser Ala Leu Asn Cys Leu 1250 1255 1260Asn Pro Gly Gly Thr Leu Val Val Lys Ser Tyr Gly Tyr Ala Asp Arg 12651270 1275 1280 Asn Ser Glu Asp Val Val Thr Ala Leu Ala Arg Lys Phe ValArg Val 1285 1290 1295 Ser Ala Ala Arg Pro Glu Cys Val Ser Ser Asn ThrGlu Met Tyr Leu 1300 1305 1310 Ile Phe Arg Gln Leu Asp Asn Ser Arg ThrArg Gln Phe Thr Pro His 1315 1320 1325 His Leu Asn Cys Val Ile Ser SerVal Tyr Glu Gly Thr Arg Asp Gly 1330 1335 1340 Val Gly Ala Ala Pro SerTyr Arg Thr Lys Arg Glu Asn Ile Ala Asp 1345 1350 1355 1360 Cys Gln GluGlu Ala Val Val Asn Ala Ala Asn Pro Leu Gly Arg Pro 1365 1370 1375 GlyGlu Gly Val Cys Arg Ala Ile Tyr Lys Arg Trp Pro Asn Ser Phe 1380 13851390 Thr Asp Ser Ala Thr Glu Thr Gly Thr Ala Lys Leu Thr Val Cys Gln1395 1400 1405 Gly Lys Lys Val Ile His Ala Val Gly Pro Asp Phe Arg LysHis Pro 1410 1415 1420 Glu Ala Glu Ala Leu Lys Leu Leu Gln Asn Ala TyrHis Ala Val Ala 1425 1430 1435 1440 Asp Leu Val Asn Glu His Asn Ile LysSer Val Ala Ile Pro Leu Leu 1445 1450 1455 Ser Thr Gly Ile Tyr Ala AlaGly Lys Asp Arg Leu Glu Val Ser Leu 1460 1465 1470 Asn Cys Leu Thr ThrAla Leu Asp Arg Thr Asp Ala Asp Val Thr Ile 1475 1480 1485 Tyr Cys LeuAsp Lys Lys Trp Lys Glu Arg Ile Asp Ala Val Leu Gln 1490 1495 1500 LeuLys Glu Ser Val Thr Glu Leu Lys Asp Glu Asp Met Glu Ile Asp 1505 15101515 1520 Asp Glu Leu Val Trp Ile His Pro Asp Ser Cys Leu Lys Gly ArgLys 1525 1530 1535 Gly 
Phe Ser Thr Thr Lys Gly Lys Leu Tyr Ser Tyr PheGlu Gly Thr 1540 1545 1550 Lys Phe His Gln Ala Ala Lys Asp Met Ala GluIle Lys Val Leu Phe 1555 1560 1565 Pro Asn Asp Gln Glu Ser Asn Glu GlnLeu Cys Ala Tyr Ile Leu Gly 1570 1575 1580 Glu Thr Met Glu Ala Ile ArgGlu Lys Cys Pro Val Asp His Asn Pro 1585 1590 1595 1600 Ser Ser Ser ProPro Lys Thr Leu Pro Cys Leu Cys Met Tyr Ala Met 1605 1610 1615 Thr ProGlu Arg Val His Arg Leu Arg Ser Asn Asn Val Lys Glu Val 1620 1625 1630Thr Val Cys Ser Ser Thr Pro Leu Pro Lys Tyr Lys Ile Lys Asn Val 16351640 1645 Gln Lys Val Gln Cys Thr Lys Val Val Leu Phe Asn Pro His ThrPro 1650 1655 1660 Ala Phe Val Pro Ala Arg Lys Tyr Ile Glu Ala Pro GluGln Pro Ala 1665 1670 1675 1680 Ala Pro Pro Ala Gln Ala Glu Glu Ala ProGly Val Val Ala Thr Pro 1685 1690 1695 Thr Pro Pro Ala Ala Asp Asn ThrSer Leu Asp Val Thr Asp Ile Ser 1700 1705 1710 Leu Asp Met Glu Asp SerSer Glu Gly Ser Leu Phe Ser Ser Phe Ser 1715 1720 1725 Gly Ser Asp AsnTyr Arg Arg Gln Val Val Val Ala Asp Val His Ala 1730 1735 1740 Val GlnGlu Pro Ala Pro Val Pro Pro Pro Arg Leu Lys Lys Met Ala 1745 1750 17551760 Arg Leu Ala Ala Ala Arg Met Gln Glu Glu Pro Thr Pro Pro Ala Ser1765 1770 1775 Thr Ser Ser Ala Asp Glu Ser Leu His Leu Ser Phe Asp GlyVal Ser 1780 1785 1790 Ile Ser Phe Gly Ser Leu Phe Asp Gly Glu Met AlaArg Leu Ala Ala 1795 1800 1805 Ala Gln Pro Pro Ala Ser Thr Cys Pro ThrAsp Val Pro Met Ser Phe 1810 1815 1820 Gly Ser Phe Ser Asp Gly Glu IleGlu Glu Leu Ser Arg Arg Val Thr 1825 1830 1835 1840 Glu Ser Glu Pro ValLeu Phe Gly Ser Phe Glu Pro Gly Glu Val Asn 1845 1850 1855 Ser Ile IleSer Ser Arg Ser Ala Val Ser Phe Pro Pro Arg Lys Gln 1860 1865 1870 ArgArg Arg Arg Arg Ser Arg Arg Thr Glu Tyr Cys Leu Thr Gly Val 1875 18801885 Gly Gly Tyr Ile Phe Ser Thr Asp Thr Gly Pro Gly His Leu Gln Lys1890 1895 1900 Lys Ser Val Leu Gln Asn Gln Leu Thr Glu Pro Thr Leu GluArg Asn 1905 1910 1915 1920 Val Leu Glu Arg Ile Tyr Ala Pro Val Leu AspThr Ser Lys Glu Glu 1925 1930 1935 Gln Leu Lys Leu Arg Tyr Gln Met 
MetPro Thr Glu Ala Asn Lys Ser 1940 1945 1950 Arg Tyr Gln Ser Arg Lys ValGlu Asn Gln Lys Ala Ile Thr Thr Glu 1955 1960 1965 Arg Leu Leu Ser GlyLeu Arg Leu Tyr Asn Ser Ala Thr Asp Gln Pro 1970 1975 1980 Glu Cys TyrLys Ile Thr Tyr Pro Lys Pro Ser Tyr Ser Ser Ser Val 1985 1990 1995 2000Pro Ala Asn Tyr Ser Asp Pro Lys Phe Ala Val Ala Val Cys Asn Asn 20052010 2015 Tyr Leu His Glu Asn Tyr Pro Thr Val Ala Ser Tyr Gln Ile ThrAsp 2020 2025 2030 Glu Tyr Asp Ala Tyr Leu Asp Met Val Asp Gly Thr ValAla Cys Leu 2035 2040 2045 Asp Thr Ala Thr Phe Cys Pro Ala Lys Leu ArgSer Tyr Pro Lys Arg 2050 2055 2060 His Glu Tyr Arg Ala Pro Asn Ile ArgSer Ala Val Pro Ser Ala Met 2065 2070 2075 2080 Gln Asn Thr Leu Gln AsnVal Leu Ile Ala Ala Thr Lys Arg Asn Cys 2085 2090 2095 Asn Val Thr GlnMet Arg Glu Leu Pro Thr Leu Asp Ser Ala Thr Phe 2100 2105 2110 Asn ValGlu Cys Phe Arg Lys Tyr Ala Cys Asn Asp Glu Tyr Trp Glu 2115 2120 2125Glu Phe Ala Arg Lys Pro Ile Arg Ile Thr Thr Glu Phe Val Thr Ala 21302135 2140 Tyr Val Ala Arg Leu Lys Gly Pro Lys Ala Ala Ala Leu Phe AlaLys 2145 2150 2155 2160 Thr His Asn Leu Val Pro Leu Gln Glu Val Pro MetAsp Arg Phe Val 2165 2170 2175 Met Asp Met Lys Arg Asp Val Lys Val ThrPro Gly Thr Lys His Thr 2180 2185 2190 Glu Glu Arg Pro Lys Val Gln ValIle Gln Ala Ala Glu Pro Leu Ala 2195 2200 2205 Thr Ala Tyr Leu Cys GlyIle His Arg Glu Leu Val Arg Arg Leu Thr 2210 2215 2220 Ala Val Leu LeuPro Asn Ile His Thr Leu Phe Asp Met Ser Ala Glu 2225 2230 2235 2240 AspPhe Asp Ala Ile Ile Ala Glu His Phe Lys Gln Gly Asp Pro Val 2245 22502255 Leu Glu Thr Asp Ile Ala Ser Phe Asp Lys Ser Gln Asp Asp Ala Met2260 2265 2270 Ala Leu Thr Gly Leu Met Ile Leu Glu Asp Leu Gly Val AspGln Pro 2275 2280 2285 Leu Leu Asp Leu Ile Glu Cys Ala Phe Gly Glu IleSer Ser Thr His 2290 2295 2300 Leu Pro Thr Gly Thr Arg Phe Lys Phe GlyAla Met Met Lys Ser Gly 2305 2310 2315 2320 Met Phe Leu Thr Leu Phe ValAsn Thr Val Leu Asn Val Val Ile Ala 2325 2330 2335 Ser Arg Val Leu GluGlu Arg Leu Lys Thr Ser Lys Cys Ala Ala 
Phe 2340 2345 2350 Ile Gly Asp Asp Asn Ile Ile His Gly Val Val Ser Asp Lys Glu Met 2355 2360 2365 Ala Glu Arg Cys Ala Thr Trp Leu Asn Met Glu Val Lys Ile Ile Asp 2370 2375 2380 Ala Val Ile Gly Glu Arg Pro Pro Tyr Phe Cys Gly Gly Phe Ile Leu 2385 2390 2395 2400 Gln Asp Ser Val Thr Ser Thr Ala Cys Arg Val Ala Asp Pro Leu Lys 2405 2410 2415 Arg Leu Phe Lys Leu Gly Lys Pro Leu Pro Ala Asp Asp Glu Gln Asp 2420 2425 2430 Glu Asp Arg Arg Arg Ala Leu Leu Asp Glu Thr Lys Ala Trp Phe Arg 2435 2440 2445 Val Gly Ile Thr Asp Thr Leu Ala Val Ala Val Ala Thr Arg Tyr Glu 2450 2455 2460 Val Asp Asn Ile Thr Pro Val Leu Leu Ala Leu Arg Thr Phe Ala Gln 2465 2470 2475 2480 Ser Lys Arg Ala Phe Gln Ala Ile Arg Gly Glu Ile Lys His Leu Tyr 2485 2490 2495 Gly Gly Pro Lys 2500

SEQ ID NO: 3. Length: 1245 amino acids. Type: amino acid. Topology: linear. Molecule type: protein.

Met Asn Arg Gly Phe Phe Asn Met Leu Gly Arg Arg Pro Phe Pro Ala 1 5 10 15 Pro Thr Ala Met Trp Arg Pro Arg Arg Arg Arg Gln Ala Ala Pro Met 20 25 30 Pro Ala Arg Asn Gly Leu Ala Ser Gln Ile Gln Gln Leu Thr Thr Ala 35 40 45 Val Ser Ala Leu Val Ile Gly Gln Ala Thr Arg Pro Gln Thr Pro Arg 50 55 60 Pro Arg Pro Pro Pro Arg Gln Lys Lys Gln Ala Pro Lys Gln Pro Pro 65 70 75 80 Lys Pro Lys Lys Pro Lys Thr Gln Glu Lys Lys Lys Lys Gln Pro Ala 85 90 95 Lys Pro Lys Pro Gly Lys Arg Gln Arg Met Ala Leu Lys Leu Glu Ala 100 105 110 Asp Arg Leu Phe Asp Val Lys Asn Glu Asp Gly Asp Val Ile Gly His 115 120 125 Ala Leu Ala Met Glu Gly Lys Val Met Lys Pro Leu His Val Lys Gly 130 135 140 Thr Ile Asp His Pro Val Leu Ser Lys Leu Lys Phe Thr Lys Ser Ser 145 150 155 160 Ala Tyr Asp Met Glu Phe Ala Gln Leu Pro Val Asn Met Arg Ser Glu 165 170 175 Ala Phe Thr Tyr Thr Ser Glu His Pro Glu Gly Phe Tyr Asn Trp His 180 185 190 His Gly Ala Val Gln Tyr Ser Gly Gly Arg Phe Thr Ile Pro Arg Gly 195 200 205 Val Gly Gly Arg Gly Asp Ser Gly Arg Pro Ile Met Asp Asn Ser Gly 210 215 220 Arg Val Val Ala Ile Val Leu Gly Gly Ala Asp Glu Gly Thr Arg Thr 225 230 235 240 Ala Leu Ser Val Val Thr Trp Asn Ser Lys Gly Lys Thr Ile Lys Thr 245 250 255 Thr Pro Glu
Gly ThrGlu Glu Trp Ser Ala Ala Pro Leu Val Thr Ala 260 265 270 Met Cys Leu LeuGly Asn Val Ser Phe Pro Cys Asn Arg Pro Pro Thr 275 280 285 Cys Tyr ThrArg Glu Pro Ser Arg Ala Leu Asp Ile Leu Glu Glu Asn 290 295 300 Val AsnHis Glu Ala Tyr Asp Thr Leu Leu Asn Ala Ile Leu Arg Cys 305 310 315 320Gly Ser Ser Gly Arg Ser Lys Arg Ser Val Thr Asp Asp Phe Thr Leu 325 330335 Thr Ser Pro Tyr Leu Gly Thr Cys Ser Tyr Cys His His Thr Glu Pro 340345 350 Cys Phe Ser Pro Ile Lys Ile Glu Gln Val Trp Asp Glu Ala Asp Asp355 360 365 Asn Thr Ile Arg Ile Gln Thr Ser Ala Gln Phe Gly Tyr Asp GlnSer 370 375 380 Gly Ala Ala Ser Ser Asn Lys Tyr Arg Tyr Met Ser Leu GluGln Asp 385 390 395 400 His Thr Val Lys Glu Gly Thr Met Asp Asp Ile LysIle Ser Thr Ser 405 410 415 Gly Pro Cys Arg Arg Leu Ser Tyr Lys Gly TyrPhe Leu Leu Ala Lys 420 425 430 Cys Pro Pro Gly Asp Ser Val Thr Val SerIle Ala Ser Ser Asn Ser 435 440 445 Ala Thr Ser Cys Thr Met Ala Arg LysIle Lys Pro Lys Phe Val Gly 450 455 460 Arg Glu Lys Tyr Asp Leu Pro ProVal His Gly Lys Lys Ile Pro Cys 465 470 475 480 Thr Val Tyr Asp Arg LeuLys Glu Thr Thr Ala Gly Tyr Ile Thr Met 485 490 495 His Arg Pro Gly ProHis Ala Tyr Thr Ser Tyr Leu Glu Glu Ser Ser 500 505 510 Gly Lys Val TyrAla Lys Pro Pro Ser Gly Lys Asn Ile Thr Tyr Glu 515 520 525 Cys Lys CysGly Asp Tyr Lys Thr Gly Thr Val Thr Thr Arg Thr Glu 530 535 540 Ile ThrGly Cys Thr Ala Ile Lys Gln Cys Val Ala Tyr Lys Ser Asp 545 550 555 560Gln Thr Lys Trp Val Phe Asn Ser Pro Asp Ser Ile Arg His Ala Asp 565 570575 His Thr Ala Gln Gly Lys Leu His Leu Pro Phe Lys Leu Ile Pro Ser 580585 590 Thr Cys Met Val Pro Val Ala His Ala Pro Asn Val Val His Gly Phe595 600 605 Lys His Ile Ser Leu Gln Leu Asp Thr Asp His Leu Thr Leu LeuThr 610 615 620 Thr Arg Arg Leu Gly Ala Asn Pro Glu Pro Thr Thr Glu TrpIle Ile 625 630 635 640 Gly Asn Thr Val Arg Asn Phe Thr Val Asp Arg AspGly Leu Glu Tyr 645 650 655 Ile Trp Gly Asn His Glu Pro Val Arg Val TyrAla Gln Glu Ser Ala 660 665 670 Pro Gly Asp Pro His Gly Trp Pro His GluIle 
Val Gln His Tyr Tyr 675 680 685 His Arg His Pro Val Tyr Thr Ile LeuAla Val Ala Ser Ala Ala Val 690 695 700 Ala Met Met Ile Gly Val Thr ValAla Ala Leu Cys Ala Cys Lys Ala 705 710 715 720 Arg Arg Glu Cys Leu ThrPro Tyr Ala Leu Ala Pro Asn Ala Val Ile 725 730 735 Pro Thr Ser Leu AlaLeu Leu Cys Cys Val Arg Ser Ala Asn Ala Glu 740 745 750 Thr Phe Thr GluThr Met Ser Tyr Leu Trp Ser Asn Ser Gln Pro Phe 755 760 765 Phe Trp ValGln Leu Cys Ile Pro Leu Ala Ala Val Val Val Leu Met 770 775 780 Arg CysCys Ser Cys Cys Leu Pro Phe Leu Val Val Ala Gly Ala Tyr 785 790 795 800Leu Ala Lys Val Asp Ala Tyr Glu His Ala Thr Thr Val Pro Asn Val 805 810815 Pro Gln Ile Pro Tyr Lys Ala Leu Val Glu Arg Ala Gly Tyr Ala Pro 820825 830 Leu Asn Leu Glu Ile Thr Val Met Ser Ser Glu Val Leu Pro Ser Thr835 840 845 Asn Gln Glu Tyr Ile Thr Cys Lys Phe Thr Thr Val Val Pro SerPro 850 855 860 Lys Val Arg Cys Cys Gly Ser Leu Glu Cys Gln Pro Ala AlaHis Ala 865 870 875 880 Asp Tyr Thr Cys Lys Val Phe Gly Gly Val Tyr ProPhe Met Trp Gly 885 890 895 Gly Ala Gln Cys Phe Cys Asp Ser Glu Asn SerGln Met Ser Glu Ala 900 905 910 Tyr Val Glu Leu Ser Val Asp Cys Ala ThrAsp His Ala Gln Ala Ile 915 920 925 Lys Val His Thr Ala Ala Met Lys ValGly Leu Arg Ile Val Tyr Gly 930 935 940 Asn Thr Thr Ser Phe Leu Asp ValTyr Val Asn Gly Val Thr Pro Gly 945 950 955 960 Thr Ser Lys Asp Leu LysVal Ile Ala Gly Pro Ile Ser Ala Leu Phe 965 970 975 Thr Pro Phe Asp HisLys Val Val Ile Asn Arg Gly Leu Val Tyr Asn 980 985 990 Tyr Asp Phe ProGlu Tyr Gly Ala Met Lys Pro Gly Ala Phe Gly Asp 995 1000 1005 Ile GlnAla Thr Ser Leu Thr Ser Lys Asp Leu Ile Ala Ser Thr Asp 1010 1015 1020Ile Arg Leu Leu Lys Pro Ser Ala Lys Asn Val His Val Pro Tyr Thr 10251030 1035 1040 Gln Ala Ala Ser Gly Phe Glu Met Trp Lys Asn Asn Ser GlyArg Pro 1045 1050 1055 Leu Gln Glu Thr Ala Pro Phe Gly Cys Lys Ile AlaVal Asn Pro Leu 1060 1065 1070 Arg Ala Val Asp Cys Ser Tyr Gly Asn IlePro Ile Ser Ile Asp Ile 1075 1080 1085 Pro Asn Ala Ala Phe Ile Arg ThrSer Asp Ala Pro Leu Val 
Ser Thr 1090 1095 1100 Val Lys Cys Asp Val Ser Glu Cys Thr Tyr Ser Ala Asp Phe Gly Gly 1105 1110 1115 1120 Met Ala Thr Leu Gln Tyr Val Ser Asp Arg Glu Gly Gln Cys Pro Val 1125 1130 1135 His Ser His Ser Ser Thr Ala Thr Leu Gln Glu Ser Thr Val His Val 1140 1145 1150 Leu Glu Lys Gly Ala Val Thr Val His Phe Ser Thr Ala Ser Pro Gln 1155 1160 1165 Ala Asn Phe Ile Val Ser Leu Cys Gly Lys Lys Thr Thr Cys Asn Ala 1170 1175 1180 Glu Cys Lys Pro Pro Ala Asp His Ile Val Ser Thr Pro His Lys Asn 1185 1190 1195 1200 Asp Gln Glu Phe Gln Ala Ala Ile Ser Lys Thr Ser Trp Ser Trp Leu 1205 1210 1215 Phe Ala Leu Phe Gly Gly Ala Ser Ser Leu Leu Ile Ile Gly Leu Met 1220 1225 1230 Ile Phe Ala Cys Ser Met Met Leu Thr Ser Thr Arg Arg 1235 1240 1245

SEQ ID NO: 4. Length: 11717 base pairs. Type: nucleic acid. Strandedness: double. Topology: linear. Molecule type: cDNA.

NTTGNCGGCG TAGTATACAC TATTGAATCA AACAGCCGAC CAATTGCACT ACCATCACA 59 ATG GAG AAG CCA GTA GTT AAC GTA GAC GTA GAC CCG CAG AGT CCG TTT 107 GTC GTG CAA CTG CAA AAG AGC TTC CCG CAA TTT GAG GTA GTA GCA CAG 155 CAG GTC ACT CCA AAT GAC CAT GCT AAT GCC AGA GCA TTT TCG CAT CTG 203 GCC AGT AAA CTA ATC GAG CTG GAG GTT CCT ACC ACA GCG ACG ATT TTG 251 GAC ATA GGC AGC GCA CCG GCT CGT AGA ATG TTT TCC GAG CAC CAG TAC 299 CAT TGC GTT TGC CCC ATG CGT AGT CCA GAA GAC CCG GAC CGC ATG ATG 347 AAA TAT GCC AGC AAA CTG GCG GAA AAA GCA TGC AAG ATT ACG AAT AAG 395 AAC TTG CAT GAG AAG ATC AAG GAC CTC CGG ACC GTA CTT GAT ACA CCG 443 GAT GCT GAA ACG CCA TCA CTC TGC TTC CAC AAC GAT GTT ACC TGC AAC 491 ACG CGT GCC GAG TAC TCC GTC ATG CAG GAC GTG TAC ATC AAC GCT CCC 539 GGA ACT ATT TAC CAT CAG GCT ATG AAA GGC GTG CGG ACC CTG TAC TGG 587 ATT GGC TTC GAT ACC ACC CAG TTC ATG TTC TCG GCT ATG GCA GGT TCG 635 TAC CCT GCG TAC AAC ACC AAC TGG GCC GAC GAA AAA GTC CTC GAA GCG 683 CGT AAC ATC GGA CTC TGC AGC ACA AAG CTG AGT GAA GGC AGG ACA GGA 731 AAG TTG TCG ATA ATG AGG AAG AAG GAG TTG AAG CCC GGG TCA CGG GTT 779 TAT TTC TCC GTT GGA TCG ACA CTT TAC CCA GAA CAC AGA GCC AGC TTG 827 CAG AGC TGG CAT CTT CCA TCG GTG TTC CAC CTG AAA GGA AAG CAG TCG 875 TAC ACT TGC CGC TGT
GAT ACA GTG GTG AGC TGCGAA GGC TAC GTA GTG 923 AAG AAA ATC ACC ATC AGT CCC GGG ATC ACG GGA GAAACC GTG GGA TAC 971 GCG GTT ACA AAC AAT AGC GAG GGC TTC TTG CTA TGC AAAGTT ACC GAT 1019 ACA GTA AAA GGA GAA CGG GTA TCG TTC CCC GTG TGC ACG TATATC CCG 1067 GCC ACC ATA TGC GAT CAG ATG ACC GGC ATA ATG GCC ACG GAT ATCTCA 1115 CCT GAC GAT GCA CAA AAA CTT CTG GTT GGG CTC AAC CAG CGA ATC GTC1163 ATT AAC GGT AAG ACT AAC AGG AAC ACC AAT ACC ATG CAA AAT TAC CTT1211 CTG CCA ATC ATT GCA CAA GGG TTC AGC AAA TGG GCC AAG GAG CGC AAA1259 GAA GAC CTT GAC AAT GAA AAA ATG CTG GGT ACC AGA GAG CGC AAG CTT1307 ACA TAT GGC TGC TTG TGG GCG TTT CGC ACT AAG AAA GTG CAC TCG TTC1355 TAT CGC CCA CCT GGA ACG CAG ACC ATC GTA AAA GTC CCA GCC TCT TTT1403 AGC GCT TTC CCC ATG TCA TCC GTA TGG ACT ACC TCT TTG CCC ATG TCG1451 CTG AGG CAG AAG ATA AAA TTG GCA TTA CAA CCA AAG AAG GAG GAA AAA1499 CTG CTG CAA GTC CCG GAG GAA TTA GTC ATG GAG GCC AAG GCT GCT TTC1547 GAG GAT GCT CAG GAG GAA TCC AGA GCG GAG AAG CTC CGA GAA GCA CTC1595 CCA CCA TTA GTG GCA GAC AAA GGT ATC GAG GCA GCC GCG GAA GTT GTC1643 TGC GAA GTG GAG GGG CTC CAG GCG GAC ATC GGA GCA GCA CTC GTC GAA1691 ACC CCG CGC GGT CAT GTA AGG ATA ATA CCA CAA GCA AAT GAC CGT ATG1739 ATC GGA CAG TAC ATC GTT GTC TCG CCA ACC TCT GTG CTG AAG AAC GCT1787 AAA CTC GCA CCA GCA CAC CCG CTA GCA GAC CAG GTT AAG ATC ATA ACG1835 CAC TCC GGA AGA TCA GGA AGG TAT GCA GTC GAA CCA TAC GAC GCT AAA1883 GTA CTG ATG CCA GCA GGA AGT GCC GTA CCA TGG CCA GAA TTC TTA GCA1931 CTG AGT GAG AGC GCC ACG CTA GTG TAC AAC GAA AGA GAG TTT GTG AAC1979 CGC AAG CTG TAC CAT ATT GCC ATG CAC GGT CCC GCT AAG AAT ACA GAA2027 GAG GAG CAG TAC AAG GTT ACA AAG GCA GAG CTC GCA GAA ACA GAG TAC2075 GTG TTT GAC GTG GAC AAG AAG CGA TGC GTC AAG AAG GAA GAA GCC TCA2123 GGA CTT GTC CTC TCG GGA GAA CTG ACC AAC CCG CCC TAT CAC GAA CTA2171 GCT CTT GAG GGA CTG AAG ACT CGA CCC GTG GTC CCG TAC AAG GTT GAA2219 ACA ATA GGA GTG ATA GGC GCA CCA GGA TCG GGC AAG TCG GCT ATC ATC2267 AAG TCA ACT GTC ACG GCA CGT GAT CTT GTT ACC AGC 
GGA AAG AAA GAA2315 AAC TGC CGC GAA ATT CAG GCC GAT GTG CTA CGG CTG AGG GGC ATG CAG2363 ATC ACG TCG AAG ACA GTG GAT TCG GTT ATG CTC AAC GGA TGC CGC AAA2411 GCC GTA GAA GTG CTG TAT GTT GAC GAA GCG TTC GCG TGC CAC GCA GGA2459 GCA CTA CTT GCC TTG ATT GCA ATC GTC AGA CCC CGT CAT AAG GTA GTG2507 CTA TGC GGA GAC CCT AAG CAA TGC GGA TTC TTC AAC ATG ATG CAA CTA2555 AAG GTA TAT TTC AAC CAC CCG GAA AAA GAC ATA TGT ACC AAG ACA TTC2603 TAC AAG TTT ATC TCC CGA CGT TGC ACA CAG CCA GTC ACG GCT ATT GTA2651 TCG ACA CTG CAT TAC GAT GGA AAA ATG AAA ACC ACA AAC CCG TGC AAG2699 AAG AAC ATC GAA ATC GAC ATT ACA GGG GCC ACG AAG CCG AAG CCA GGG2747 GAC ATC ATC CTG ACA TGC TTC CGC GGG TGG GTT AAG CAA CTG CAA ATC2795 GAC TAT CCC GGA CAT GAG GTA ATG ACA GCC GCG GCC TCA CAA GGG CTA2843 ACC AGA AAA GGA GTA TAT GCC GTC CGG CAA AAA GTC AAT GAA AAC CCG2891 CTG TAC GCG ATC ACA TCA GAG CAT GTG AAC GTG CTG CTC ACC CGC ACT2939 GAG GAC AGG CTA GTA TGG AAA ACT TTA CAG GGC GAC CCA TGG ATT AAG2987 CAG CTC ACT AAC GTA CCA AAA GGA AAT TTT CAA GCC ACC ATC GAG GAC3035 TGG GAA GCT GAA CAC AAG GGA ATA ATT GCT GCG ATA AAC AGT CCC GCT3083 CCC CGT ACC AAT CCG TTC AGC TGC AAG ACT AAC GTT TGC TGG GCG AAA3131 CGA CTG GAA CCG ATA CTG GCC ACG GCC GGT ATC GTA CTT ACC GGT TGC3179 CAG TGG AGC GAG CTG TTC CCA CAG TTT GCA GAT GAC AAA CCA CAC TCG3227 GCC ATC TAC GCC CTG GAC GTA ATC TGC ATT AAG TTT TTC GGC ATG GAC3275 TTG ACA AGC GGA CTG TTT TCC AAA CAG AGC ATC CCG TTA ACG TAC CAT3323 CCT GCC GAT TCA GCG AGG CCA GTA GCT CAT TGG GAC AAC AGC CCA GGA3371 ACC CGC AAG TAT GGG TAC GAT CAC GCC GTT GCC GCC GAA CTC TCC CGT3419 AGA TTT CCG GTG TTC CAG CTA GCT GGG AAA GGC ACA CAG CTT GAT TTG3467 CAG ACG GGC AGA ACT AGA GTT ATC TCC GCA CAG CAT AAC TTG GTC CCA3515 GTG AAC CGC AAT CTC CCG CAC GCC TTA GTC CCC GAG CAC AAG GAG AAA3563 CAA CCC GGC CCG GTC AAA AAA TTC TTG AGC CAG TTC AAA CAC CAC TCC3611 GTA CTT GTG GTC TCA GAG GAA AAA ATT GAA GCT CCC CAC AAG AGA ATC3659 GAA TGG ATC GCC CCG ATT GGC ATA GCC GGC GCT GAT AAG AAC TAC AAC3707 CTG GCT 
TTC GGG TTT CCG CCG CAG GCA CGG TAC GAC CTG GTG TTT ATC3755 AAT ATT GGA ACT AAA TAC AGA AAC CAT CAC TTT CAG CAG TGC GAA GAC3803 CAT GCG GCG ACC TTG AAA ACC CTC TCG CGT TCG GCC CTG AAC TGC CTT3851 AAC CCC GGA GGC ACC CTC GTG GTG AAG TCC TAC GGT TAC GCC GAC CGC3899 AAT AGT GAG GAC GTA GTC ACC GCT CTT GCC AGA AAA TTT GTC AGA GTG3947 TCT GCA GCG AGG CCA GAG TGC GTC TCA AGC AAT ACA GAA ATG TAC CTG3995 ATC TTC CGA CAA CTA GAC AAC AGC CGC ACA CGA CAA TTC ACC CCG CAT4043 CAT CTG AAT TGT GTG ATT TCG TCC GTG TAC GAG GGT ACA AGA GAC GGA4091 GTT GGA GCC GCA CCG TCA TAC CGC ACT AAA AGG GAG AAC ATT GCT GAT4139 TGT CAA GAG GAA GCA GTT GTC AAT GCA GCC AAT CCG CTG GGC AGA CCA4187 GGC GAA GGA GTC TGC CGT GCC ATC TAT AAA CGT TGG CCG AAC AGT TTC4235 ACC GAT TCA GCC ACA GAG ACC GGC ACC GCA AAA CTG ACT GTG TGC CAA4283 GGA AAG AAA GTG ATC CAC GCG GTT GGC CCT GAT TTC CGG AAA CAC CCA4331 GAG GCA GAA GCC CTG AAA TTG CTG CAA AAC GCC TAC CAT GCA GTG GCA4379 GAC TTA GTA AAT GAA CAT AAT ATC AAG TCT GTC GCC ATC CCA CTG CTA4427 TCT ACA GGC ATT TAC GCA GCC GGA AAA GAC CGC CTT GAA GTA TCA CTT4475 AAC TGC TTG ACA ACC GCG CTA GAT AGA ACT GAT GCG GAC GTA ACC ATC4523 TAC TGC CTG GAT AAG AAG TGG AAG GAA AGA ATC GAC GCG GTG CTC CAA4571 CTT AAG GAG TCT GTA ATA GAG CTG AAG GAT GAG GAT ATG GAG ATC GAC4619 GAC GAG TTA GTA TGG ATC CAT CCG GAC AGT TGC CTG AAG GGA AGA AAG4667 GGA TTC AGT ACT ACA AAA GGA AAG TTG TAT TCG TAC TTT GAA GGC ACC4715 AAA TTC CAT CAA GCA GCA AAA GAT ATG GCG GAG ATA AAG GTC CTG TTC4763 CCA AAT GAC CAG GAA AGC AAC GAG CAA CTG TGT GCC TAC ATA TTG GGG4811 GAG ACC ATG GAA GCA ATC CGC GAA AAA TGC CCG GTC GAC CAC AAC CCG4859 TCG TCT AGC CCG CCA AAA ACG CTG CCG TGC CTC TGC ATG TAT GCC ATG4907 ACG CCA GAA AGG GTC CAC AGA CTC AGA AGC AAC AAC GTC AAA GAA GTT4955 ACA GTA TGC TCC TCC ACC CCC CTT CCA AAG TAC AAA ATC AAG AAC GTT5003 CAG AAG GTT CAG TGC ACA AAA GTA GTC CTG TTT AAC CCG CAT ACC CCT5051 GCA TTC GTT CCC GCC CGT AAG TAC ATA GAA GCG CCA GAA CAG CCT GCA5099 GCT CCG CCT GCA CAG GCC GAG GAG GCC 
CCC GAA GTT GCA GCA ACA CCA5147 ACA CCA CCT GCA GCT GAT AAC ACC TCG CTT GAT GTC ACG GAC ATC TCA5195 CTG GAC ATG GAA GAC AGT AGC GAA GGC TCA CTC TTT TCG AGC TTT AGC5243 GGA TCG GAC AAC TCT ATT ACT AGT ATG GAC AGT TGG TCG TCA GGA CCT5291 AGT TCA CTA GAG ATA GTA GAC CGA AGG CAG GTG GTG GTG GCT GAC GTC5339 CAT GCC GTC CAA GAG CCT GCC CCT GTT CCA CCG CCA AGG CTA AAG AAG5387 ATG GCC CGC CTG GCA GCG GCA AGA ATG CAG GAA GAG CCA ACT CCA CCG5435 GCA AGC ACC AGC TCT GCG GAC GAG TCC CTT CAC CTT TCT TTT GGT GGG5483 GTA TCC ATG TCC TTC GGA TCC CTT TTC GAC GGA GAG ATG GGC GCC TTG5531 GCA GCG GCA CAA CCC CCG GCA AGT ACA TGC CCT ACG GAT GTG CCT ATG5579 TCT TTC GGA TCG TTT TCC GAC GGA GAG ATT GAG GAG CTG AGC CGC AGA5627 GTA ACC GAG TCT GAG CCC GTC CTG TTT GGG TCA TTT GAA CCG GGC GAA5675 GTG AAC TCA ATT ATA TCG TCC CGA TCA GTT GTA TCT TTT CCA CCA CGC5723 AAG CAG AGA CGT AGA CGC AGG AGC AGG AGG ACC GAA TAC TGA CTA ACC5771 GGG GTA GGT GGG TAC ATA TTT TCG ACG GAC ACA GGC CCT GGG CAC TTG5819 CAA ATG GAG TCC GTT CTG CAG AAT CAG CTT ACA GAA CCG ACC TTG GAG5867 CGC AAT GTT CTG GAA AGA ATC TAC GCC CCG GTG CTC GAC ACG TCG AAA5915 GAG GAA CAG CTC AAA CTC AGG TAC CAG ATG ATG CCC ACC GAA GCC AAC5963 AAA AGC AGG TAC CAG TCT AGA AAA GTA GAA AAT CAG AAA GCC ATA ACC6011 ACT GAG CGA CTG CTT TCA GGG CTA CGA CTG TAT AAC TCT GCC ACA GAT6059 CAG CCA GAA TGC TAT AAG ATC ACC TAC CCG AAA CCA TCG TAT TCC AGC6107 AGT GTA CCG GCG AAC TAC TCT GAC CCA AAG TTT GCT GTA GCT GTT TGC6155 AAC AAC TAT CTG CAT GAG AAT TAC CCG ACG GTA GCA TCT TAT CAG ATC6203 ACC GAC GAG TAC GAT GCT TAC TTG GAT ATG GTA GAC GGG ACA GTC GCT6251 TGC CTA GAT ACT GCA ACT TTT TGC CCC GCC AAG CTT AGA AGT TAC CCG6299 AAA AGA CAC GAG TAT AGA GCC CCA AAC ACT CGC AGT GCG GTT CCA TCA6347 GCG ATG CAG AAC ACG TTG CAA AAC GTG CTC ATT GCC GCG ACT AAA AGA6395 AAC TGC AAC GTC ACA CAA ATG CGT GAA TTG CCA ACA CTG GAC TCA GCG6443 ACA TTC AAC GTT GAA TGC TTT CGA AAA TAT GCA TGT AAT GAC GAG TAT6491 TGG GAG GAG TTT GCC CGA AAG CCA ATT AGG ATC ACT ACT GAG TTC 
GTT6539 ACC GCA TAC GTG GCC AGA CTG AAA GGC CCT AAG GCC GCC GCA CTG TTC6587 GCA AAG ACG CAT AAT TTG GTC CCA TTG CAA GAA GTG CCT ATG GAT AGG6635 TTC GTC ATG GAC ATG AAA AGA GAC GTG AAA GTT ACA CCT GGC ACG AAA6683 CAC ACA GAA GAA AGA CCG AAA GTA CAA GTG CTA CAA GCC GCA GAA CCC6731 CTG GCG ACC GCT TAC CTG TGC GGG ATC CAC CGG GAG TTA GTG CGC AGG6779 CTT ACA GCC GTC TTG CTA CCC AAC ATT CAC ACG CTT TTT GAC ATG TCG6827 GCG GAG GAC TTT GAT GCA ATC ATA GCA GAA CAC TTC AAG CAA GGT GAC6875 CCG GTA CTG GAG ACG GAT ATC GCC TCG TTC GAC AAA AGC CAA GAC GAC6923 GCT ATG GCG TTA ACT GGC CTG ATG ATC TTG GAA GAC CTG GGT GTG GAC6971 CAA CCA CTA CTC GAC TTG ATC GAG TGC GCC TTT GGA GAA ATA TCA TCC7019 ACC CAT CTG CCC ACG GGT ACC CGT TTC AAA TTC GGG GCG ATG ATG AAA7067 TCC GGA ATG TTC CTC ACG CTC TTT GTC AAC ACA GTT CTG AAT GTC GTT7115 ATC GCC AGC AGA GTA TTG GAG GAG CGG CTT AAA ACG TCC AAA TGT GCA7163 GCA TTT ATC GGC GAC GAC AAC ATC ATA CAC GGA GTA GTA TCT GAC AAA7211 GAA ATG GCT GAG AGG TGT GCC ACC TGG CTC AAC ATG GAG GTT AAG ATC7259 ATT GAC GCA GTC ATC GGC GAG AGA CCG CCT TAC TTC TGC GGT GGA TTC7307 ATC TTG CAA GAT TCG GTT ACC TCC ACA GCG TGT CGC GTG GCG GAC CCC7355 TTG AAA AGG CTG TTT AAG TTG GGT AAA CCG CTC CCA GCC GAC GAC GAG7403 CAA GAC GAA GAC AGA AGA CGC GCT CTG CTA GAT GAA ACA AAG GCG TGG7451 TTT AGA GTA GGT ATA ACA GAC ACC TTA GCA GTG GCC GTG GCA ACT CGG7499 TAT GAG GTA GAC AAC ATC ACA CCT GTC CTG CTG GCA TTG AGA ACT TTT7547 GCC CAG AGC AAA AGA GCA TTT CAA GCC ATC AGA GGG GAA ATA AAG CAT7595 CTC TAC GGT GGT CCT AAA TAGTCAGCAT AGCACATTTC ATCTGACTAA 7643TACCACAACA CCACCACC ATG AAT AGA GGA TTC TTT AAC ATG CTC GGC CGC 7694 CGCCCC TTC CCG GCC CCC ACT GCC ATG TGG AGG CCG CGG AGA AGG AGG 7742 CAG GCGGCC CCG ATG CCT GCC CGC AAT GGG CTG GCT TCC CAA ATC CAG 7790 CAA CTG ACCACA GCC GTC AGT GCC CTA GTC ATT GGA CAG GCA ACT AGA 7838 CCT CAA ACC CCACGC CCA CGC CCG CCG CCG CGC CAG AAG AAG CAG GCG 7886 CCA AAG CAA CCA CCGAAG CCG AAG AAA CCA AAA ACA CAG GAG AAG AAG 7934 AAG AAG CAA CCT GCA 
AAACCC AAA CCC GGA AAG AGA CAA CGT ATG GCA 7982 CTC AAG TTG GAG GCC GAC AGACTG TTC GAC GTC AAA AAT GAG GAC GGA 8030 GAT GTC ATC GGG CAC GCA CTG GCCATG GAA GGA AAG GTA ATG AAA CCA 8078 CTC CAC GTG AAA GGA ACT ATT GAC CACCCT GTG CTA TCA AAG CTC AAA 8126 TTC ACC AAG TCG TCA GCA TAC GAC ATG GAGTTC GCA CAG TTG CCG GTC 8174 AAC ATG AGA AGT GAG GCG TTC ACC TAC ACC AGCGAA CAC CCT GAA GGG 8222 TTT TAC AAC TGG CAC CAC GGA GCG GTG CAG TAT AGTGGA GGT AGA TTT 8270 ACC ATC CCC CGC GGA GTA GGA GGC AGA GGA GAC AGT GGTCGT CCG ATT 8318 ATG GAT AAC TCA GGC CGG GTT GTC GCG ATA GTC CTC GGA GGGGCT GAT 8366 GAG GGA ACA AGA ACT GCC CTT TCG GTC GTC ACC TGG AAT AGC AAAGGG 8414 AAG ACA ATC AAG ACA ACC CCG GAA GGG ACA GAA GAG TGG TCT GCA GCA8462 CCA CTG GTC ACG GCC ATG TGC TTG CTT GGA AAC GTG AGC TTC CCA TGC8510 AAT CGC CCG CCC ACA TGC TAC ACC CGC GAA CCA TCC AGA GCT CTT GAC8558 ATC CTT GAA GAG AAC GTG AAC CAC GAG GCC TAC GAC ACC CTG CTC AAC8606 GCC ATA TTG CGG TGC GGA TCG TCC GGC AGA AGC AAA AGA AGC GTC ACT8654 GAC GAC TTT ACC TTG ACC AGC CCG TAC TTG GGC ACA TGC TCG TAC TGT8702 CAC CAT ACT GAA CCG TGC TTT AGC CCG ATT AAG ATC GAG CAG GTC TGG8750 GAT GAA GCG GAC GAC AAC ACC ATA CGC ATA CAG ACT TCC GCC CAG TTT8798 GGA TAC GAC CAA AGC GGA GCA GCA AGC TCA AAT AAG TAC CGC TAC ATG8846 TCG CTC GAG CAG GAT CAT ACC GTC AAA GAA GGC ACT ATG GAT GAC ATC8894 AAG ATC AGC ACC TCA GGA CCG TGT AGA AGG CTT AGC TAC AAA GGA TAC8942 TTT CTC CTC GCG AAG TGT CCT CCA GGG GAC AGC GTA ACG GTT AGT ATA8990 GCG AGT AGC AAC TCA GCA ACG TCA TGC ACA ATG GCC CGC AAG ATA AAA9038 CCA AAA TTC GTG GGA CGG GAA AAA TAT GAC CTA CCT CCC GTT CAC GGT9086 AAG AAG ATT CCT TGC ACA GTG TAC GAC CGT CTG AAA GAA ACA ACC GCC9134 GGC TAC ATC ACT ATG CAC AGG CCG GGA CCG CAC GCC TAT ACG TCC TAT9182 CTG GAG GAA TCA TCA GGG AAA GTC TAC GCG AAG CCA CCA TCC GGA AAG9230 AAC ATT ACG TAC GAG TGC AAG TGC GGC GAT TAC AAG ACC GGT ACC GTT9278 ACG ACC CGT ACC GAA ATC ACG GGC TGC ACC GCC ATC AAG CAG TGC GTC9326 GCC TAT AAG AGC GAC CAA ACG AAG TGG GTC TTC AAT 
TCG CCG GAC TTG9374 ATC AGA CAT GCC GAC CAC ACG GCC CAA GGG AAA TTG CAT TTA CCT TTC9422 AAG CTG ATC CCG AGT ACC TGC ATG GTC CCT GTT GCC CAC GCG CCG AAC9470 GTA GTA CAC GGC TTT AAA CAC ATC AGC CTC CAA TTA GAC ACA GAC CAC9518 CTG ACA TTG CTC ACC ACC AGG AGA CTA GGG GCA AAT CCG GAA CCA ACT9566 ACT GAA TGG ATC ATC GGA AAG ACG GTT AGA AAC TTC ACC GTC GAC CGA9614 GAT GGC CTG GAA TAC ATA TGG GGC AAT CAC GAA CCG GTA AGG GTC TAT9662 GCC CAA GAG TCT GCA CCA GGA GAC CCT CAC GGA TGG CCA CAC GAA ATA9710 GTA CAG CAT TAC TAC CAT CGC CAT CCT GTG TAC ACC ATC TTA GCC GTC9758 GCA TCA GCT GCT GTG GCG ATG ATG ATT GGC GTA ACT GTT GCA GCA TTA9806 TGT GCC TGT AAA GCG CGC CGT GAG TGC CTG ACG CCA TAT GCC CTG GCC9854 CCA AAT GCC GTG ATT CCA ACT TCG CTG GCA CTT TTG TGC TGT GTT AGG9902 TCG GCT AAT GCT GAA ACA TTC ACC GAG ACC ATG AGT TAC CTA TGG TCG9950 AAC AGC CAG CCA TTC TTC TGG GTC CAG CTG TGT ATA CCC CTG GCC GCT9998 GTC ATC GTT CTA ATG CGC TGT TGC TCA TGC TGC CTG CCT TTT TTA GTG10046 GTT GCC GGC GCC TAC CTG GCG AAG GTA GAC GCC TAC GAA CAT GCG ACC10094 ACT GTT CCA AAT GTG CCA CAG ATA CCG TAT AAG GCA CTT GTT GAA AGG10142 GCA GGG TAC GCC CCG CTC AAT TTG GAG ATT ACT GTC ATG TCC TCG GAG10190 GTT TTG CCT TCC ACC AAC CAA GAG TAC ATC ACC TGC AAA TTC ACC ACT10238 GTG GTC CCC TCC CCT AAA GTC AAA TGC TGC GGC TCC TTG GAA TGT CAG10286 CCC GCC GCT CAC GCA GAC TAT ACC TGC AAG GTC TTT GGA GGG GTG TAC10334 CCC TTC ATG TGG GGA GGA GCA CAA TGT TTT TGC GAC AGT GAG AAC AGC10382 CAG ATG AGT GAG GCG TAC GTC GAA TTG TCA GCA GAT TGC GCG ACT GAC10430 CAC GCG CAG GCG ATT AAG GTG CAT ACT GCC GCG ATG AAA GTA GGA CTA10478 CGT ATA GTG TAC GGG AAC ACT ACC AGT TTC CTA GAT GTG TAC GTG AAC10526 GGA GTC ACA CCA GGA ACG TCT AAA GAC CTG AAA GTC ATA GCT GGA CCA10574 ATT TCA GCA TCG TTT ACA CCA TTC GAT CAC AAG GTC GTT ATC CAT CGC10622 GGC CTG GTG TAC AAC TAT GAC TTC CCG GAA TAC GGA GCG ATG AAA CCA10670 GGA GCG TTT GGA GAC ATT CAA GCT ACC TCC TTG ACT AGC AAA GAT CTC10718 ATC GCC AGC ACA GAC ATT AGA CTA CTC AAG CCT TCC GCC AAG AAC 
GTG 10766 CAT GTC CCG TAC ACG CAG GCC GCA TCT GGA TTC GAG ATG TGG AAA AAC 10814 AAC TCA GGC CGC CCA CTG CAG GAA ACC GCC CCT TTC GGG TGC AAG ATT 10862 GCA GTC AAT CCG CTT CGA GCG GTG GAC TGC TCA TAC GGG AAC ATT CCC 10910 ATC TCT ATC GAC ATC CCG AAC GCT GCC TTT ATC AGG ACA TCA GAT GCA 10958 CCA CTG GTC TCA ACA GTC AAA TGT GAT GTC AGT GAG TGC ACT TAC TCA 11006 GCG GAC TTC GGC GGG ATG GCT ACC CTG CAG TAT GTA TCC GAC CGC GAA 11054 GGA CAA TGC CCT GTA CAT TCG CAT TCG AGC ACA GCA ACC CTC CAA GAG 11102 TCG ACA GTT CAT GTC CTG GAG AAA GGA GCG GTG ACA GTA CAC TTC AGC 11150 ACC GCG AGC CCA CAG GCG AAC TTT ATT GTA TCG CTG TGT GGT AAG AAG 11198 ACA ACA TGC AAT GCA GAA TGC AAA CCA CCA GCT GAC CAT ATC GTG AGC 11246 ACC CCG CAC AAA AAT GAC CAA GAA TTC CAA GCC GCC ATC TCA AAA ACT 11294 TCA TGG AGT TGG CTG TTT GCC CTT TTC GGC GGC GCC TCG TCG CTA TTA 11342 ATT ATA GGA CTT ATG ATT TTT GCT TGC AGC ATG ATG CTG ACT AGC ACA 11390 CGA AGA TGACCGCTAC GCCCCAATGA CCCGACCAGC AAAACTCGAT GTACTTCCGA 11446 GGAACTGATG TGCATAATGC ATCAGGCTGG TATATTAGAT CCCCGCTTAC CGCGGGCAAT 11506 ATAGCAACAC CAAAACTCGA CGTATTTCCG AGGAAGCGCA GTGCATAATG CTGCGCAGTG 11566 TTGCCAAATA ATCACTATAT TAACCATTTA TTTAGCGGAC GCCAAAACTC AATGTATTTC 11626 TGAGGAAGCA TGGTGCATAA TGCCATGCAG CGTCTGCATA ACTTTTTATT ATTTCTTTTA 11686 TTAATCAACA AAATTTTGTT TTTAACATTT N 11717

SEQ ID NO: 5. Length: 2517 amino acids. Type: amino acid. Topology: linear. Molecule type: protein.

Met Glu Lys Pro Val Val Asn Val Asp Val Asp Pro Gln Ser Pro Phe 1 5 10 15 Val Val Gln Leu Gln Lys Ser Phe Pro Gln Phe Glu Val Val Ala Gln 20 25 30 Gln Val Thr Pro Asn Asp His Ala Asn Ala Arg Ala Phe Ser His Leu 35 40 45 Ala Ser Lys Leu Ile Glu Leu Glu Val Pro Thr Thr Ala Thr Ile Leu 50 55 60 Asp Ile Gly Ser Ala Pro Ala Arg Arg Met Phe Ser Glu His Gln Tyr 65 70 75 80 His Cys Val Cys Pro Met Arg Ser Pro Glu Asp Pro Asp Arg Met Met 85 90 95 Lys Tyr Ala Ser Lys Leu Ala Glu Lys Ala Cys Lys Ile Thr Asn Lys 100 105 110 Asn Leu His Glu Lys Ile Lys Asp Leu Arg Thr Val Leu Asp Thr Pro 115 120 125 Asp Ala Glu Thr Pro Ser Leu Cys Phe His Asn Asp Val Thr Cys Asn 130 135
140 Thr Arg Ala Glu Tyr Ser ValMet Gln Asp Val Tyr Ile Asn Ala Pro 145 150 155 160 Gly Thr Ile Tyr HisGln Ala Met Lys Gly Val Arg Thr Leu Tyr Trp 165 170 175 Ile Gly Phe AspThr Thr Gln Phe Met Phe Ser Ala Met Ala Gly Ser 180 185 190 Tyr Pro AlaTyr Asn Thr Asn Trp Ala Asp Glu Lys Val Leu Glu Ala 195 200 205 Arg AsnIle Gly Leu Cys Ser Thr Lys Leu Ser Glu Gly Arg Thr Gly 210 215 220 LysLeu Ser Ile Met Arg Lys Lys Glu Leu Lys Pro Gly Ser Arg Val 225 230 235240 Tyr Phe Ser Val Gly Ser Thr Leu Tyr Pro Glu His Arg Ala Ser Leu 245250 255 Gln Ser Trp His Leu Pro Ser Val Phe His Leu Lys Gly Lys Gln Ser260 265 270 Tyr Thr Cys Arg Cys Asp Thr Val Val Ser Cys Glu Gly Tyr ValVal 275 280 285 Lys Lys Ile Thr Ile Ser Pro Gly Ile Thr Gly Glu Thr ValGly Tyr 290 295 300 Ala Val Thr Asn Asn Ser Glu Gly Phe Leu Leu Cys LysVal Thr Asp 305 310 315 320 Thr Val Lys Gly Glu Arg Val Ser Phe Pro ValCys Thr Tyr Ile Pro 325 330 335 Ala Thr Ile Cys Asp Gln Met Thr Gly IleMet Ala Thr Asp Ile Ser 340 345 350 Pro Asp Asp Ala Gln Lys Leu Leu ValGly Leu Asn Gln Arg Ile Val 355 360 365 Ile Asn Gly Lys Thr Asn Arg AsnThr Asn Thr Met Gln Asn Tyr Leu 370 375 380 Leu Pro Ile Ile Ala Gln GlyPhe Ser Lys Trp Ala Lys Glu Arg Lys 385 390 395 400 Glu Asp Leu Asp AsnGlu Lys Met Leu Gly Thr Arg Glu Arg Lys Leu 405 410 415 Thr Tyr Gly CysLeu Trp Ala Phe Arg Thr Lys Lys Val His Ser Phe 420 425 430 Tyr Arg ProPro Gly Thr Gln Thr Ile Val Lys Val Pro Ala Ser Phe 435 440 445 Ser AlaPhe Pro Met Ser Ser Val Trp Thr Thr Ser Leu Pro Met Ser 450 455 460 LeuArg Gln Lys Ile Lys Leu Ala Leu Gln Pro Lys Lys Glu Glu Lys 465 470 475480 Leu Leu Gln Val Pro Glu Glu Leu Val Met Glu Ala Lys Ala Ala Phe 485490 495 Glu Asp Ala Gln Glu Glu Ser Arg Ala Glu Lys Leu Arg Glu Ala Leu500 505 510 Pro Pro Leu Val Ala Asp Lys Gly Ile Glu Ala Ala Ala Glu ValVal 515 520 525 Cys Glu Val Glu Gly Leu Gln Ala Asp Ile Gly Ala Ala LeuVal Glu 530 535 540 Thr Pro Arg Gly His Val Arg Ile Ile Pro Gln Ala AsnAsp Arg Met 545 550 555 560 Ile Gly Gln Tyr Ile Val 
Val Ser Pro Thr SerVal Leu Lys Asn Ala 565 570 575 Lys Leu Ala Pro Ala His Pro Leu Ala AspGln Val Lys Ile Ile Thr 580 585 590 His Ser Gly Arg Ser Gly Arg Tyr AlaVal Glu Pro Tyr Asp Ala Lys 595 600 605 Val Leu Met Pro Ala Gly Ser AlaVal Pro Trp Pro Glu Phe Leu Ala 610 615 620 Leu Ser Glu Ser Ala Thr LeuVal Tyr Asn Glu Arg Glu Phe Val Asn 625 630 635 640 Arg Lys Leu Tyr HisIle Ala Met His Gly Pro Ala Lys Asn Thr Glu 645 650 655 Glu Glu Gln TyrLys Val Thr Lys Ala Glu Leu Ala Glu Thr Glu Tyr 660 665 670 Val Phe AspVal Asp Lys Lys Arg Cys Val Lys Lys Glu Glu Ala Ser 675 680 685 Gly LeuVal Leu Ser Gly Glu Leu Thr Asn Pro Pro Tyr His Glu Leu 690 695 700 AlaLeu Glu Gly Leu Lys Thr Arg Pro Val Val Pro Tyr Lys Val Glu 705 710 715720 Thr Ile Gly Val Ile Gly Ala Pro Gly Ser Gly Lys Ser Ala Ile Ile 725730 735 Lys Ser Thr Val Thr Ala Arg Asp Leu Val Thr Ser Gly Lys Lys Glu740 745 750 Asn Cys Arg Glu Ile Gln Ala Asp Val Leu Arg Leu Arg Gly MetGln 755 760 765 Ile Thr Ser Lys Thr Val Asp Ser Val Met Leu Asn Gly CysArg Lys 770 775 780 Ala Val Glu Val Leu Tyr Val Asp Glu Ala Phe Ala CysHis Ala Gly 785 790 795 800 Ala Leu Leu Ala Leu Ile Ala Ile Val Arg ProArg His Lys Val Val 805 810 815 Leu Cys Gly Asp Pro Lys Gln Cys Gly PhePhe Asn Met Met Gln Leu 820 825 830 Lys Val Tyr Phe Asn His Pro Glu LysAsp Ile Cys Thr Lys Thr Phe 835 840 845 Tyr Lys Phe Ile Ser Arg Arg CysThr Gln Pro Val Thr Ala Ile Val 850 855 860 Ser Thr Leu His Tyr Asp GlyLys Met Lys Thr Thr Asn Pro Cys Lys 865 870 875 880 Lys Asn Ile Glu IleAsp Ile Thr Gly Ala Thr Lys Pro Lys Pro Gly 885 890 895 Asp Ile Ile LeuThr Cys Phe Arg Gly Trp Val Lys Gln Leu Gln Ile 900 905 910 Asp Tyr ProGly His Glu Val Met Thr Ala Ala Ala Ser Gln Gly Leu 915 920 925 Thr ArgLys Gly Val Tyr Ala Val Arg Gln Lys Val Asn Glu Asn Pro 930 935 940 LeuTyr Ala Ile Thr Ser Glu His Val Asn Val Leu Leu Thr Arg Thr 945 950 955960 Glu Asp Arg Leu Val Trp Lys Thr Leu Gln Gly Asp Pro Trp Ile Lys 965970 975 Gln Leu Thr Asn Val Pro Lys Gly Asn Phe Gln Ala Thr Ile 
Glu Asp980 985 990 Trp Glu Ala Glu His Lys Gly Ile Ile Ala Ala Ile Asn Ser ProAla 995 1000 1005 Pro Arg Thr Asn Pro Phe Ser Cys Lys Thr Asn Val CysTrp Ala Lys 1010 1015 1020 Arg Leu Glu Pro Ile Leu Ala Thr Ala Gly IleVal Leu Thr Gly Cys 1025 1030 1035 1040 Gln Trp Ser Glu Leu Phe Pro GlnPhe Ala Asp Asp Lys Pro His Ser 1045 1050 1055 Ala Ile Tyr Ala Leu AspVal Ile Cys Ile Lys Phe Phe Gly Met Asp 1060 1065 1070 Leu Thr Ser GlyLeu Phe Ser Lys Gln Ser Ile Pro Leu Thr Tyr His 1075 1080 1085 Pro AlaAsp Ser Ala Arg Pro Val Ala His Trp Asp Asn Ser Pro Gly 1090 1095 1100Thr Arg Lys Tyr Gly Tyr Asp His Ala Val Ala Ala Glu Leu Ser Arg 11051110 1115 1120 Arg Phe Pro Val Phe Gln Leu Ala Gly Lys Gly Thr Gln LeuAsp Leu 1125 1130 1135 Gln Thr Gly Arg Thr Arg Val Ile Ser Ala Gln HisAsn Leu Val Pro 1140 1145 1150 Val Asn Arg Asn Leu Pro His Ala Leu ValPro Glu His Lys Glu Lys 1155 1160 1165 Gln Pro Gly Pro Val Lys Lys PheLeu Ser Gln Phe Lys His His Ser 1170 1175 1180 Val Leu Val Val Ser GluGlu Lys Ile Glu Ala Pro His Lys Arg Ile 1185 1190 1195 1200 Glu Trp IleAla Pro Ile Gly Ile Ala Gly Ala Asp Lys Asn Tyr Asn 1205 1210 1215 LeuAla Phe Gly Phe Pro Pro Gln Ala Arg Tyr Asp Leu Val Phe Ile 1220 12251230 Asn Ile Gly Thr Lys Tyr Arg Asn His His Phe Gln Gln Cys Glu Asp1235 1240 1245 His Ala Ala Thr Leu Lys Thr Leu Ser Arg Ser Ala Leu AsnCys Leu 1250 1255 1260 Asn Pro Gly Gly Thr Leu Val Val Lys Ser Tyr GlyTyr Ala Asp Arg 1265 1270 1275 1280 Asn Ser Glu Asp Val Val Thr Ala LeuAla Arg Lys Phe Val Arg Val 1285 1290 1295 Ser Ala Ala Arg Pro Glu CysVal Ser Ser Asn Thr Glu Met Tyr Leu 1300 1305 1310 Ile Phe Arg Gln LeuAsp Asn Ser Arg Thr Arg Gln Phe Thr Pro His 1315 1320 1325 His Leu AsnCys Val Ile Ser Ser Val Tyr Glu Gly Thr Arg Asp Gly 1330 1335 1340 ValGly Ala Ala Pro Ser Tyr Arg Thr Lys Arg Glu Asn Ile Ala Asp 1345 13501355 1360 Cys Gln Glu Glu Ala Val Val Asn Ala Ala Asn Pro Leu Gly ArgPro 1365 1370 1375 Gly Glu Gly Val Cys Arg Ala Ile Tyr Lys Arg Trp ProAsn Ser Phe 1380 1385 1390 Thr Asp 
Ser Ala Thr Glu Thr Gly Thr Ala LysLeu Thr Val Cys Gln 1395 1400 1405 Gly Lys Lys Val Ile His Ala Val GlyPro Asp Phe Arg Lys His Pro 1410 1415 1420 Glu Ala Glu Ala Leu Lys LeuLeu Gln Asn Ala Tyr His Ala Val Ala 1425 1430 1435 1440 Asp Leu Val AsnGlu His Asn Ile Lys Ser Val Ala Ile Pro Leu Leu 1445 1450 1455 Ser ThrGly Ile Tyr Ala Ala Gly Lys Asp Arg Leu Glu Val Ser Leu 1460 1465 1470Asn Cys Leu Thr Thr Ala Leu Asp Arg Thr Asp Ala Asp Val Thr Ile 14751480 1485 Tyr Cys Leu Asp Lys Lys Trp Lys Glu Arg Ile Asp Ala Val LeuGln 1490 1495 1500 Leu Lys Glu Ser Val Ile Glu Leu Lys Asp Glu Asp MetGlu Ile Asp 1505 1510 1515 1520 Asp Glu Leu Val Trp Ile His Pro Asp SerCys Leu Lys Gly Arg Lys 1525 1530 1535 Gly Phe Ser Thr Thr Lys Gly LysLeu Tyr Ser Tyr Phe Glu Gly Thr 1540 1545 1550 Lys Phe His Gln Ala AlaLys Asp Met Ala Glu Ile Lys Val Leu Phe 1555 1560 1565 Pro Asn Asp GlnGlu Ser Asn Glu Gln Leu Cys Ala Tyr Ile Leu Gly 1570 1575 1580 Glu ThrMet Glu Ala Ile Arg Glu Lys Cys Pro Val Asp His Asn Pro 1585 1590 15951600 Ser Ser Ser Pro Pro Lys Thr Leu Pro Cys Leu Cys Met Tyr Ala Met1605 1610 1615 Thr Pro Glu Arg Val His Arg Leu Arg Ser Asn Asn Val LysGlu Val 1620 1625 1630 Thr Val Cys Ser Ser Thr Pro Leu Pro Lys Tyr LysIle Lys Asn Val 1635 1640 1645 Gln Lys Val Gln Cys Thr Lys Val Val LeuPhe Asn Pro His Thr Pro 1650 1655 1660 Ala Phe Val Pro Ala Arg Lys TyrIle Glu Ala Pro Glu Gln Pro Ala 1665 1670 1675 1680 Ala Pro Pro Ala GlnAla Glu Glu Ala Pro Glu Val Ala Ala Thr Pro 1685 1690 1695 Thr Pro ProAla Ala Asp Asn Thr Ser Leu Asp Val Thr Asp Ile Ser 1700 1705 1710 LeuAsp Met Glu Asp Ser Ser Glu Gly Ser Leu Phe Ser Ser Phe Ser 1715 17201725 Gly Ser Asp Asn Ser Ile Thr Ser Met Asp Ser Trp Ser Ser Gly Pro1730 1735 1740 Ser Ser Leu Glu Ile Val Asp Arg Arg Gln Val Val Val AlaAsp Val 1745 1750 1755 1760 His Ala Val Gln Glu Pro Ala Pro Val Pro ProPro Arg Leu Lys Lys 1765 1770 1775 Met Ala Arg Leu Ala Ala Ala Arg MetGln Glu Glu Pro Thr Pro Pro 1780 1785 1790 Ala Ser Thr Ser Ser Ala AspGlu Ser 
Leu His Leu Ser Phe Gly Gly 1795 1800 1805 Val Ser Met Ser PheGly Ser Leu Phe Asp Gly Glu Met Gly Ala Leu 1810 1815 1820 Ala Ala AlaGln Pro Pro Ala Ser Thr Cys Pro Thr Asp Val Pro Met 1825 1830 1835 1840Ser Phe Gly Ser Phe Ser Asp Gly Glu Ile Glu Glu Leu Ser Arg Arg 18451850 1855 Val Thr Glu Ser Glu Pro Val Leu Phe Gly Ser Phe Glu Pro GlyGlu 1860 1865 1870 Val Asn Ser Ile Ile Ser Ser Arg Ser Val Val Ser PhePro Pro Arg 1875 1880 1885 Lys Gln Arg Arg Arg Arg Arg Ser Arg Arg ThrGlu Tyr Leu Thr Gly 1890 1895 1900 Val Gly Gly Tyr Ile Phe Ser Thr AspThr Gly Pro Gly His Leu Gln 1905 1910 1915 1920 Met Glu Ser Val Leu GlnAsn Gln Leu Thr Glu Pro Thr Leu Glu Arg 1925 1930 1935 Asn Val Leu GluArg Ile Tyr Ala Pro Val Leu Asp Thr Ser Lys Glu 1940 1945 1950 Glu GlnLeu Lys Leu Arg Tyr Gln Met Met Pro Thr Glu Ala Asn Lys 1955 1960 1965Ser Arg Tyr Gln Ser Arg Lys Val Glu Asn Gln Lys Ala Ile Thr Thr 19701975 1980 Glu Arg Leu Leu Ser Gly Leu Arg Leu Tyr Asn Ser Ala Thr AspGln 1985 1990 1995 2000 Pro Glu Cys Tyr Lys Ile Thr Tyr Pro Lys Pro SerTyr Ser Ser Ser 2005 2010 2015 Val Pro Ala Asn Tyr Ser Asp Pro Lys PheAla Val Ala Val Cys Asn 2020 2025 2030 Asn Tyr Leu His Glu Asn Tyr ProThr Val Ala Ser Tyr Gln Ile Thr 2035 2040 2045 Asp Glu Tyr Asp Ala TyrLeu Asp Met Val Asp Gly Thr Val Ala Cys 2050 2055 2060 Leu Asp Thr AlaThr Phe Cys Pro Ala Lys Leu Arg Ser Tyr Pro Lys 2065 2070 2075 2080 ArgHis Glu Tyr Arg Ala Pro Asn Thr Arg Ser Ala Val Pro Ser Ala 2085 20902095 Met Gln Asn Thr Leu Gln Asn Val Leu Ile Ala Ala Thr Lys Arg Asn2100 2105 2110 Cys Asn Val Thr Gln Met Arg Glu Leu Pro Thr Leu Asp SerAla Thr 2115 2120 2125 Phe Asn Val Glu Cys Phe Arg Lys Tyr Ala Cys AsnAsp Glu Tyr Trp 2130 2135 2140 Glu Glu Phe Ala Arg Lys Pro Ile Arg IleThr Thr Glu Phe Val Thr 2145 2150 2155 2160 Ala Tyr Val Ala Arg Leu LysGly Pro Lys Ala Ala Ala Leu Phe Ala 2165 2170 2175 Lys Thr His Asn LeuVal Pro Leu Gln Glu Val Pro Met Asp Arg Phe 2180 2185 2190 Val Met AspMet Lys Arg Asp Val Lys Val Thr Pro Gly Thr Lys His 
2195 2200 2205 Thr Glu Glu Arg Pro Lys Val Gln Val Leu Gln Ala Ala Glu Pro Leu 2210 2215 2220 Ala Thr Ala Tyr Leu Cys Gly Ile His Arg Glu Leu Val Arg Arg Leu 2225 2230 2235 2240 Thr Ala Val Leu Leu Pro Asn Ile His Thr Leu Phe Asp Met Ser Ala 2245 2250 2255 Glu Asp Phe Asp Ala Ile Ile Ala Glu His Phe Lys Gln Gly Asp Pro 2260 2265 2270 Val Leu Glu Thr Asp Ile Ala Ser Phe Asp Lys Ser Gln Asp Asp Ala 2275 2280 2285 Met Ala Leu Thr Gly Leu Met Ile Leu Glu Asp Leu Gly Val Asp Gln 2290 2295 2300 Pro Leu Leu Asp Leu Ile Glu Cys Ala Phe Gly Glu Ile Ser Ser Thr 2305 2310 2315 2320 His Leu Pro Thr Gly Thr Arg Phe Lys Phe Gly Ala Met Met Lys Ser 2325 2330 2335 Gly Met Phe Leu Thr Leu Phe Val Asn Thr Val Leu Asn Val Val Ile 2340 2345 2350 Ala Ser Arg Val Leu Glu Glu Arg Leu Lys Thr Ser Lys Cys Ala Ala 2355 2360 2365 Phe Ile Gly Asp Asp Asn Ile Ile His Gly Val Val Ser Asp Lys Glu 2370 2375 2380 Met Ala Glu Arg Cys Ala Thr Trp Leu Asn Met Glu Val Lys Ile Ile 2385 2390 2395 2400 Asp Ala Val Ile Gly Glu Arg Pro Pro Tyr Phe Cys Gly Gly Phe Ile 2405 2410 2415 Leu Gln Asp Ser Val Thr Ser Thr Ala Cys Arg Val Ala Asp Pro Leu 2420 2425 2430 Lys Arg Leu Phe Lys Leu Gly Lys Pro Leu Pro Ala Asp Asp Glu Gln 2435 2440 2445 Asp Glu Asp Arg Arg Arg Ala Leu Leu Asp Glu Thr Lys Ala Trp Phe 2450 2455 2460 Arg Val Gly Ile Thr Asp Thr Leu Ala Val Ala Val Ala Thr Arg Tyr 2465 2470 2475 2480 Glu Val Asp Asn Ile Thr Pro Val Leu Leu Ala Leu Arg Thr Phe Ala 2485 2490 2495 Gln Ser Lys Arg Ala Phe Gln Ala Ile Arg Gly Glu Ile Lys His Leu 2500 2505 2510 Tyr Gly Gly Pro Lys 2515
SEQ ID NO: 6: 1245 amino acids; type: amino acid; topology: linear; molecule type: protein
Met Asn Arg Gly Phe Phe Asn Met Leu Gly Arg Arg Pro Phe Pro Ala 1 5 10 15 Pro Thr Ala Met Trp Arg Pro Arg Arg Arg Arg Gln Ala Ala Pro Met 20 25 30 Pro Ala Arg Asn Gly Leu Ala Ser Gln Ile Gln Gln Leu Thr Thr Ala 35 40 45 Val Ser Ala Leu Val Ile Gly Gln Ala Thr Arg Pro Gln Thr Pro Arg 50 55 60 Pro Arg Pro Pro Pro Arg Gln Lys Lys Gln Ala Pro Lys Gln Pro Pro 65 70 75 80 Lys Pro Lys Lys Pro Lys Thr Gln Glu Lys Lys Lys Lys
Gln Pro Ala 85 90 95 Lys Pro Lys Pro Gly Lys Arg Gln ArgMet Ala Leu Lys Leu Glu Ala 100 105 110 Asp Arg Leu Phe Asp Val Lys AsnGlu Asp Gly Asp Val Ile Gly His 115 120 125 Ala Leu Ala Met Glu Gly LysVal Met Lys Pro Leu His Val Lys Gly 130 135 140 Thr Ile Asp His Pro ValLeu Ser Lys Leu Lys Phe Thr Lys Ser Ser 145 150 155 160 Ala Tyr Asp MetGlu Phe Ala Gln Leu Pro Val Asn Met Arg Ser Glu 165 170 175 Ala Phe ThrTyr Thr Ser Glu His Pro Glu Gly Phe Tyr Asn Trp His 180 185 190 His GlyAla Val Gln Tyr Ser Gly Gly Arg Phe Thr Ile Pro Arg Gly 195 200 205 ValGly Gly Arg Gly Asp Ser Gly Arg Pro Ile Met Asp Asn Ser Gly 210 215 220Arg Val Val Ala Ile Val Leu Gly Gly Ala Asp Glu Gly Thr Arg Thr 225 230235 240 Ala Leu Ser Val Val Thr Trp Asn Ser Lys Gly Lys Thr Ile Lys Thr245 250 255 Thr Pro Glu Gly Thr Glu Glu Trp Ser Ala Ala Pro Leu Val ThrAla 260 265 270 Met Cys Leu Leu Gly Asn Val Ser Phe Pro Cys Asn Arg ProPro Thr 275 280 285 Cys Tyr Thr Arg Glu Pro Ser Arg Ala Leu Asp Ile LeuGlu Glu Asn 290 295 300 Val Asn His Glu Ala Tyr Asp Thr Leu Leu Asn AlaIle Leu Arg Cys 305 310 315 320 Gly Ser Ser Gly Arg Ser Lys Arg Ser ValThr Asp Asp Phe Thr Leu 325 330 335 Thr Ser Pro Tyr Leu Gly Thr Cys SerTyr Cys His His Thr Glu Pro 340 345 350 Cys Phe Ser Pro Ile Lys Ile GluGln Val Trp Asp Glu Ala Asp Asp 355 360 365 Asn Thr Ile Arg Ile Gln ThrSer Ala Gln Phe Gly Tyr Asp Gln Ser 370 375 380 Gly Ala Ala Ser Ser AsnLys Tyr Arg Tyr Met Ser Leu Glu Gln Asp 385 390 395 400 His Thr Val LysGlu Gly Thr Met Asp Asp Ile Lys Ile Ser Thr Ser 405 410 415 Gly Pro CysArg Arg Leu Ser Tyr Lys Gly Tyr Phe Leu Leu Ala Lys 420 425 430 Cys ProPro Gly Asp Ser Val Thr Val Ser Ile Ala Ser Ser Asn Ser 435 440 445 AlaThr Ser Cys Thr Met Ala Arg Lys Ile Lys Pro Lys Phe Val Gly 450 455 460Arg Glu Lys Tyr Asp Leu Pro Pro Val His Gly Lys Lys Ile Pro Cys 465 470475 480 Thr Val Tyr Asp Arg Leu Lys Glu Thr Thr Ala Gly Tyr Ile Thr Met485 490 495 His Arg Pro Gly Pro His Ala Tyr Thr Ser Tyr Leu Glu Glu SerSer 500 505 510 Gly Lys 
Val Tyr Ala Lys Pro Pro Ser Gly Lys Asn Ile ThrTyr Glu 515 520 525 Cys Lys Cys Gly Asp Tyr Lys Thr Gly Thr Val Thr ThrArg Thr Glu 530 535 540 Ile Thr Gly Cys Thr Ala Ile Lys Gln Cys Val AlaTyr Lys Ser Asp 545 550 555 560 Gln Thr Lys Trp Val Phe Asn Ser Pro AspLeu Ile Arg His Ala Asp 565 570 575 His Thr Ala Gln Gly Lys Leu His LeuPro Phe Lys Leu Ile Pro Ser 580 585 590 Thr Cys Met Val Pro Val Ala HisAla Pro Asn Val Val His Gly Phe 595 600 605 Lys His Ile Ser Leu Gln LeuAsp Thr Asp His Leu Thr Leu Leu Thr 610 615 620 Thr Arg Arg Leu Gly AlaAsn Pro Glu Pro Thr Thr Glu Trp Ile Ile 625 630 635 640 Gly Lys Thr ValArg Asn Phe Thr Val Asp Arg Asp Gly Leu Glu Tyr 645 650 655 Ile Trp GlyAsn His Glu Pro Val Arg Val Tyr Ala Gln Glu Ser Ala 660 665 670 Pro GlyAsp Pro His Gly Trp Pro His Glu Ile Val Gln His Tyr Tyr 675 680 685 HisArg His Pro Val Tyr Thr Ile Leu Ala Val Ala Ser Ala Ala Val 690 695 700Ala Met Met Ile Gly Val Thr Val Ala Ala Leu Cys Ala Cys Lys Ala 705 710715 720 Arg Arg Glu Cys Leu Thr Pro Tyr Ala Leu Ala Pro Asn Ala Val Ile725 730 735 Pro Thr Ser Leu Ala Leu Leu Cys Cys Val Arg Ser Ala Asn AlaGlu 740 745 750 Thr Phe Thr Glu Thr Met Ser Tyr Leu Trp Ser Asn Ser GlnPro Phe 755 760 765 Phe Trp Val Gln Leu Cys Ile Pro Leu Ala Ala Val IleVal Leu Met 770 775 780 Arg Cys Cys Ser Cys Cys Leu Pro Phe Leu Val ValAla Gly Ala Tyr 785 790 795 800 Leu Ala Lys Val Asp Ala Tyr Glu His AlaThr Thr Val Pro Asn Val 805 810 815 Pro Gln Ile Pro Tyr Lys Ala Leu ValGlu Arg Ala Gly Tyr Ala Pro 820 825 830 Leu Asn Leu Glu Ile Thr Val MetSer Ser Glu Val Leu Pro Ser Thr 835 840 845 Asn Gln Glu Tyr Ile Thr CysLys Phe Thr Thr Val Val Pro Ser Pro 850 855 860 Lys Val Lys Cys Cys GlySer Leu Glu Cys Gln Pro Ala Ala His Ala 865 870 875 880 Asp Tyr Thr CysLys Val Phe Gly Gly Val Tyr Pro Phe Met Trp Gly 885 890 895 Gly Ala GlnCys Phe Cys Asp Ser Glu Asn Ser Gln Met Ser Glu Ala 900 905 910 Tyr ValGlu Leu Ser Ala Asp Cys Ala Thr Asp His Ala Gln Ala Ile 915 920 925 LysVal His Thr Ala Ala Met Lys Val Gly 
Leu Arg Ile Val Tyr Gly 930 935 940 Asn Thr Thr Ser Phe Leu Asp Val Tyr Val Asn Gly Val Thr Pro Gly 945 950 955 960 Thr Ser Lys Asp Leu Lys Val Ile Ala Gly Pro Ile Ser Ala Ser Phe 965 970 975 Thr Pro Phe Asp His Lys Val Val Ile His Arg Gly Leu Val Tyr Asn 980 985 990 Tyr Asp Phe Pro Glu Tyr Gly Ala Met Lys Pro Gly Ala Phe Gly Asp 995 1000 1005 Ile Gln Ala Thr Ser Leu Thr Ser Lys Asp Leu Ile Ala Ser Thr Asp 1010 1015 1020 Ile Arg Leu Leu Lys Pro Ser Ala Lys Asn Val His Val Pro Tyr Thr 1025 1030 1035 1040 Gln Ala Ala Ser Gly Phe Glu Met Trp Lys Asn Asn Ser Gly Arg Pro 1045 1050 1055 Leu Gln Glu Thr Ala Pro Phe Gly Cys Lys Ile Ala Val Asn Pro Leu 1060 1065 1070 Arg Ala Val Asp Cys Ser Tyr Gly Asn Ile Pro Ile Ser Ile Asp Ile 1075 1080 1085 Pro Asn Ala Ala Phe Ile Arg Thr Ser Asp Ala Pro Leu Val Ser Thr 1090 1095 1100 Val Lys Cys Asp Val Ser Glu Cys Thr Tyr Ser Ala Asp Phe Gly Gly 1105 1110 1115 1120 Met Ala Thr Leu Gln Tyr Val Ser Asp Arg Glu Gly Gln Cys Pro Val 1125 1130 1135 His Ser His Ser Ser Thr Ala Thr Leu Gln Glu Ser Thr Val His Val 1140 1145 1150 Leu Glu Lys Gly Ala Val Thr Val His Phe Ser Thr Ala Ser Pro Gln 1155 1160 1165 Ala Asn Phe Ile Val Ser Leu Cys Gly Lys Lys Thr Thr Cys Asn Ala 1170 1175 1180 Glu Cys Lys Pro Pro Ala Asp His Ile Val Ser Thr Pro His Lys Asn 1185 1190 1195 1200 Asp Gln Glu Phe Gln Ala Ala Ile Ser Lys Thr Ser Trp Ser Trp Leu 1205 1210 1215 Phe Ala Leu Phe Gly Gly Ala Ser Ser Leu Leu Ile Ile Gly Leu Met 1220 1225 1230 Ile Phe Ala Cys Ser Met Met Leu Thr Ser Thr Arg Arg 1235 1240 1245
SEQ ID NO: 7: 11663 base pairs; type: nucleic acid; strandedness: double; topology: linear; molecule type: cDNA
ATTGGCGGCG TAGTACACAC TATTGAATCA AACAGCCGAC CAATTGCACT ACCATCACAA 60 TGGAGAAGCC AGTAGTTAAC GTAGACGTAG ACCCTCAGAG TCCGTTTGTC GTGCAACTGC 120 AAAAGAGCTT CCCGCAATTT GAGGTAGTAG CACAGCAGGT CACTCCAAAT GACCATGCTA 180 ATGCCAGAGC ATTTTCGCAT CTGGCCAGTA AACTGATCGA GCTGGAGGTT CCTACCACAG 240 CGACGATTTT GGACATAGGC AGCGCACCGG CTCGTAGAAT GTTTTCCGAG CACCAGTACC 300 ATTGCGTTTG CCCCATGCGT AGTCCAGAAG ACCCGGACCG CATGATGAAA TATGCCAGCA 360 AACTGGCGGA AAAAGCATGT
AAGATTACAA ACAAGAACTT GCATGAGAAG ATCAAGGACC 420 TCCGGACCGTACTTGATACA CCGGATGCTG AAACGCCATC ACTCTGCTTC CACAACGATG 480 TTACCTGCAACACGCGTGCC GAGTACTCCG TCATGCAGGA CGTGTACATC AACGCTCCCG 540 GAACTATTTACCACCAGGCT ATGAAAGGCG TGCGGACCCT GTACTGGATT GGCTTCGACA 600 CCACCCAGTTCATGTTCTCG GCTATGGCAG GTTCGTACCC TGCATACAAC ACCAACTGGG 660 CCGACGAAAAAGTCCTTGAA GCGCGTAACA TCGGACTCTG CAGCACAAAG CTGAGTGAAG 720 GCAGGACAGGAAAGTTGTCG ATAATGAGGA AGAAGGAGTT GAAGCCCGGG TCACGGGTTT 780 ATTTCTCCGTTGGATCGACA CTTTACCCAG AACACAGAGC CAGCTTGCAG AGCTGGCATC 840 TTCCATCGGTGTTCCACTTG AAAGGAAAGC AGTCGTACAC TTGCCGCTGT GATACAGTGG 900 TGAGCTGCGAAGGCTACGTA GTGAAGAAAA TCACCATCAG TCCCGGGATC ACGGGAGAAA 960 CCGTGGGATACGCGGTTACA AACAATAGCG AGGGCTTCTT GCTATGCAAA GTTACCGATA 1020 CAGTAAAAGGAGAACGGGTA TCGTTCCCCG TGTGCACGTA TATCCCGGCC ACCATATGCG 1080 ATCAGATGACCGGCATAATG GCCACGGATA TCTCACCTGA CGATGCACAA AAACTTCTGG 1140 TTGGGCTCAACCAGCGAATC GTCATTAACG GTAAGACTAA CAGGAACACC AATACCATGC 1200 AAAATTACCTTCTGCCAATC ATTGCACAAG GGTTCAGCAA ATGGGCCAAG GAGCGCAAAG 1260 AAGATCTTGACAATGAAAAA ATGCTGGGCA CCAGAGAGCG CAAGCTTACA TATGGCTGCT 1320 TGTGGGCGTTTCGCACTAAG AAAGTGCACT CGTTCTATCG CCCACCTGGA ACGCAGACCA 1380 TCGTAAAAGTCCCAGCCTCT TTTAGCGCTT TCCCCATGTC ATCCGTATGG ACTACCTCTT 1440 TGCCCATGTCGCTGAGGCAG AAGATGAAAT TGGCATTACA ACCAAAGAAG GAGGAAAAAC 1500 TGCTGCAAGTCCCGGAGGAA TTAGTTATGG AGGCCAAGGC TGCTTTCGAG GATGCTCAGG 1560 AGGAATCCAGAGCGGAGAAG CTCCGAGAAG CACTCCCACC ATTAGTGGCA GACAAAGGTA 1620 TCGAGGCAGCTGCGGAAGTT GTCTGCGAAG TGGAGGGGCT CCAGGCGGAC ACCGGAGCAG 1680 CACTCGTCGAAACCCCGCGC GGTCATGTAA GGATAATACC TCAAGCAAAT GACCGTATGA 1740 TCGGACAGTATATCGTTGTC TCGCCGATCT CTGTGCTGAA GAACGCTAAA CTCGCACCAG 1800 CACACCCGCTAGCAGACCAG GTTAAGATCA TAACGCACTC CGGAAGATCA GGAAGGTATG 1860 CAGTCGAACCATACGACGCT AAAGTACTGA TGCCAGCAGG AAGTGCCGTA CCATGGCCAG 1920 AATTCTTAGCACTGAGTGAG AGCGCCACGC TTGTGTACAA CGAAAGAGAG TTTGTGAACC 1980 GCAAGCTGTACCATATTGCC ATGCACGGTC CCGCTAAGAA TACAGAAGAG GAGCAGTACA 2040 AGGTTACAAAGGCAGAGCTC GCAGAAACAG AGTACGTGTT TGACGTGGAC AAGAAGCGAT 2100 
GCGTTAAGAAGGAAGAAGCC TCAGGACTTG TCCTTTCGGG AGAACTGACC AACCCGCCCT 2160 ATCACGAACTAGCTCTTGAG GGACTGAAGA CTCGACCCGC GGTCCCGTAC AAGGTTGAAA 2220 CAATAGGAGTGATAGGCACA CCAGGATCGG GCAAGTCAGC TATCATCAAG TCAACTGTCA 2280 CGGCACGTGATCTTGTTACC AGCGGAAAGA AAGAAAACTG CCGCGAAATT GAGGCCGACG 2340 TGCTACGGCTGAGGGGCATG CAGATCACGT CGAAGACAGT GGATTCGGTT ATGCTCAACG 2400 GATGCCACAAAGCCGTAGAA GTGCTGTATG TTGACGAAGC GTTCCGGTGC CACGCAGGAG 2460 CACTACTTGCCTTGATTGCA ATCGTCAGAC CCCGTAAGAA GGTAGTACTA TGCGGAGACC 2520 CTAAGCAATGCGGATTCTTC AACATGATGC AACTAAAGGT ACATTTCAAC CACCCTGAAA 2580 AAGACATATGTACCAAGACA TTCTACAAGT TTATCTCCCG ACGTTGCACA CAGCCAGTCA 2640 CGGCTATTGTATCGACACTG CATTACGATG GAAAAATGAA AACCACAAAC CCGTGCAAGA 2700 AGAACATCGAAATCGACATT ACAGGGGCCA CGAAGCCGAA GCCAGGGGAC ATCATCCTGA 2760 CATGTTTCCGCGGGTGGGTT AAGCAACTGC AAATCGACTA TCCCGGACAT GAGGTAATGA 2820 CAGCCGCGGCCTCACAAGGG CTAACCAGAA AAGGAGTATA TGCCGTCCGG CAAAAAGTCA 2880 ATGAAAACCCGCTGTACGCG ATCACATCAG AGCATGTGAA CGTGTTGCTC ACCCGCACTG 2940 AGGACAGGCTAGTATGGAAA ACTTTACAGG GCGACCCATG GATTAAGCAG CTCACTAACG 3000 TACCTAAAGGAAATTTTCAG GCCACCATCG AGGACTGGGA AGCTGAACAC AAGGGAATAA 3060 TTGCTGCGATAAACAGTCCC GCTCCCCGTA CCAATCCGTT CAGCTGCAAG ACTAACGTTT 3120 GCTGGGCGAAAGCACTGGAA CCGATACTGG CCACGGCCGG TATCGTACTT ACCGGTTGCC 3180 AGTGGAGCGAGCTGTTCCCA CAGTTTGCGG ATGACAAACC ACACTCGGCC ATCTACGCCT 3240 TAGACGTAATTTGCATTAAG TTTTTCGGCA TGGACTTGAC AAGCGGGCTG TTTTCCAAAC 3300 AGAGCATCCCGTTAACGTAC CATCCTGCCG ACTCAGCGAG GCCAGTAGCT CATTGGGACA 3360 ACAGCCCAGGAACACGCAAG TATGGGTACG ATCACGCCGT TGCCGCCGAA CTCTCCCGTA 3420 GATTTCCGGTGTTCCAGCTA GCTGGGAAAG GCACACAGCT TGATTTGCAG ACGGGCAGAA 3480 CTAGAGTTATCTCTGCACAG CATAACTTGG TCCCAGTGAA CCGCAATCTC CCTCACGCCT 3540 TAGTCCCCGAGCACAAGGAG AAACAACCCG GCCCGGTCGA AAAATTCTTG AGCCAGTTCA 3600 AACACCACTCCGTACTTGTG ATCTCAGAGA AAAAAATTGA AGCTCCCCAC AAGAGAATCG 3660 AATGGATCGCCCCGATTGGC ATAGCCGGCG CAGATAAGAA CTACAACCTG GCTTTCGGGT 3720 TTCCGCCGCAGGCACGGTAC GACCTGGTGT TCATCAATAT TGGAACTAAA TACAGAAACC 3780 ATCACTTTCAACAGTGCGAA GACCACGCGG 
CGACCTTGAA AACCCTTTCG CGTTCGGCCC 3840 TGAACTGCCTTAACCCCGGA GGGACCCTCG TGGTGAAGTC CTACGGTTAC GCCGACCGCA 3900 ATAGTGAGGACGTAGTCACC GCTCTTGCCA GAAAATTTGT CAGAGTGTCT GCAGCGAGGC 3960 CAGAGTGCGTCTCAAGCAAT ACAGAAATGT ACCTGATTTT CCGACAACTA GACAACAGCC 4020 GCACACGACAATTCACCCCG CATCATTTGA ATTGTGTGAT TTCGTCCGTG TACGAGGGTA 4080 CAAGAGACGGAGTTGGAGCC GCACCGTCGT ACCGTACTAA AAGGGAGAAC ATTGCTGATT 4140 GTCAAGAGGAAGCAGTTGTC AATGCAGCCA ATCCACTGGG CAGACCAGGA GAAGGAGTCT 4200 GCCGTGCCATCTATAAACGT TGGCCGAACA GTTTCACCGA TTCAGCCACA GAGACAGGTA 4260 CCGCAAAACTGACTGTGTGC CAAGGAAAGA AAGTGATCCA CGCGGTTGGC CCTGATTTCC 4320 GGAAACACCCAGAGGCAGAA GCCCTGAAAT TGCTGCAAAA CGCCTACCAT GCAGTGGCAG 4380 ACTTAGTAAATGAACATAAT ATCAAGTCTG TCGCCATCCC ACTGCTATCT ACAGGCATTT 4440 ACGCAGCCGGAAAAGACCGC CTTGAGGTAT CACTTAACTG CTTGACAACC GCGCTAGACA 4500 GAACTGATGCGGACGTAACC ATCTACTGCC TGGATAAGAA GTGGAAGGAA AGAATCGACG 4560 CGGTGCTCCAACTTAAGGAG TCTGTAACTG AGCTGAAGGA TGAGGATATG GAGATCGACG 4620 ACGAGTTAGTATGGATCCAT CCGGACAGTT GCCTGAAGGG AAGAAAGGGA TTCAGTACTA 4680 CAAAAGGAAAGTTGTATTCG TACTTTGAAG GCACCAAATT CCATCAAGCA GCAAAAGATA 4740 TGGCGGAGATAAAGGTCCTG TTCCCAAATG ACCAGGAAAG CAACGAACAA CTGTGTGCCT 4800 ACATATTGGGGGAGACCATG GAAGCAATCC GCGAAAAATG CCCGGTCGAC CACAACCCGT 4860 CGTCTAGCCCGCCAAAAACG CTGCCGTGCC TCTGTATGTA TGCCATGACG CCAGAAAGGG 4920 TCCACAGACTCAGAAGCAAT AACGTCAAAG AAGTTACAGT ATGCTCCTCC ACCCCCCTTC 4980 CAAAGTACAAAATCAAGAAT GTTCAGAAGG TTCAGTGCAC AAAAGTAGTC CTGTTTAACC 5040 CGCATACCCCCGCATTCGTT CCCGCCCGTA AGTACATAGA AGCACCAGAA CAGCCTGCAG 5100 CTCCGCCTGCACAGGCCGAG GAGGCCCCCG GAGTTGTAGC GACACCAACA CCACCTGCAG 5160 CTGATAACACCTCGCTTGAT GTCACGGACA TCTCACTGGA CATGGAAGAC AGTAGCGAAG 5220 GCTCACTCTTTTCGAGCTTT AGCGGATCGG ACAACTACCG AAGGCAGGTG GTGGTGGCTG 5280 ACGTCCATGCCGTCCAAGAG CCTGCCCCTG TTCCACCGCC AAGGCTAAAG AAGATGGCCC 5340 GCCTGGCAGCGGCAAGAATG CAGGAAGAGC CAACTCCACC GGCAAGCACC AGCTCTGCGG 5400 ACGAGTCCCTTCACCTTTCT TTTGATGGGG TATCTATATC CTTCGGATCC CTTTTCGACG 5460 GAGAGATGGCCCGCTTGGCA GCGGCACAAC CCCCGGCAAG TACATGCCCT ACGGATGTGC 5520 
CTATGTCTTTCGGATCGTTT TCCGACGGAG AGATTGAGGA GTTGAGCCGC AGAGTAACCG 5580 AGTCGGAGCCCGTCCTGTTT GGGTCATTTG AACCGGGCGA AGTGAACTCA ATTATATCGT 5640 CCCGATCAGCCGTATCTTTT CCACCACGCA AGCAGAGACG TAGACGCAGG AGCAGGAGGA 5700 CCGAATACTGTCTAACCGGG GTAGGTGGGT ACATATTTTC GACGGACACA GGCCCTGGGC 5760 ACTTGCAAAAGAAGTCCGTT CTGCAGAACC AGCTTACAGA ACCGACCTTG GAGCGCAATG 5820 TTCTGGAAAGAATCTACGCC CCGGTGCTCG ACACGTCGAA AGAGGAACAG CTCAAACTCA 5880 GGTACCAGATGATGCCCACC GAAGCCAACA AAAGCAGGTA CCAGTCTCGA AAAGTAGAAA 5940 ACCAGAAAGCCATAACCACT GAGCGACTGC TTTCAGGGCT ACGGCTGTAT AACTCTGCCA 6000 CAGATCAGCCAGAATGCTAT AAGATCACCT ACCCGAAACC ATCGTATTCC AGCAGTGTAC 6060 CAGCGAACTACTCTGACCCA AAGTTTGCTG TAGCTGTTTG TAACAACTAT CTGCATGAGA 6120 ATTACCCGACGGTAGCATCT TATCAGATCA CCGACGAGTA CGATGCTTAC TTGGATATGG 6180 TAGACGGGACAGTCGCTTGC CTAGATACTG CAACTTTTTG CCCCGCCAAG CTTAGAAGTT 6240 ACCCGAAAAGACACGAGTAT AGAGCCCCAA ACATCCGCAG TGCGGTTCCA TCAGCGATGC 6300 AGAACACGTTGCAAAACGTG CTCATTGCCG CGACTAAAAG AAACTGCAAC GTCACACAAA 6360 TGCGTGAACTGCCAACACTG GACTCAGCGA CATTCAACGT TGAATGCTTT CGAAAATATG 6420 CATGCAATGACGAGTATTGG GAGGAGTTTG CCCGAAAGCC AATTAGGATC ACTACTGAGT 6480 TCGTTACCGCATACGTGGCC AGACTGAAAG GCCCTAAGGC CGCCGCACTG TTCGCAAAGA 6540 CGCATAATTTGGTCCCATTG CAAGAAGTGC CTATGGATAG ATTCGTCATG GACATGAAAA 6600 GAGACGTGAAAGTTACACCT GGCACGAAAC ACACAGAAGA AAGACCGAAA GTACAAGTGA 6660 TACAAGCCGCAGAACCCCTG GCGACCGCTT ACCTATGCGG GATCCACCGG GAGTTAGTGC 6720 GCAGGCTTACAGCCGTTTTG CTACCCAACA TTCACACGCT CTTTGACATG TCGGCGGAGG 6780 ACTTTGATGCAATCATAGCA GAACACTTCA AGCAAGGTGA CCCGGTACTG GAGACGGATA 6840 TCGCCTCGTTCGACAAAAGC CAAGACGACG CTATGGCGTT AACCGGCCTG ATGATCTTGG 6900 AAGACCTGGGTGTGGACCAA CCACTACTCG ACTTGATCGA GTGCGCCTTT GGAGAAATAT 6960 CATCCACCCATCTGCCCACG GGTACCCGTT TCAAATTCGG GGCGATGATG AAATCCGGAA 7020 TGTTCCTCACGCTCTTTGTC AACACAGTTC TGAATGTCGT TATCGCCAGC AGAGTATTGG 7080 AGGAGCGGCTTAAAACGTCC AAATGTGCAG CATTTATCGG CGACGACAAC ATTATACACG 7140 GAGTAGTATCTGACAAAGAA ATGGCTGAGA GGTGTGCCAC CTGGCTCAAC ATGGAGGTTA 7200 AGATCATTGACGCAGTCATC GGCGAGAGAC 
CACCTTACTT CTGCGGTGGA TTCATCTTGC 7260 AAGATTCGGTTACCTCCACA GCGTGTCGCG TGGCGGACCC CTTGAAAAGG CTGTTTAAGT 7320 TGGGTAAACCGCTCCCAGCC GACGATGAGC AAGACGAAGA CAGAAGACGC GCTCTGCTAG 7380 ATGAAACAAAGGCGTGGTTT AGAGTAGGTA TAACAGACAC CTTAGCAGTG GCCGTGGCAA 7440 CTCGGTATGAGGTAGACAAC ATCACACCTG TCCTGCTGGC ATTGAGAACT TTTGCCCAGA 7500 GCAAAAGAGCATTTCAAGCC ATCAGAGGGG AAATAAAGCA TCTCTACGGT GGTCCTAAAT 7560 AGTCAGCATAGTACATTTCA TCTGACTAAT ACCACAACAC CACCACCATG AATAGAGGAT 7620 TCTTTAACATGCTCGGCCGC CGCCCCTTCC CAGCCCCCAC TGCCATGTGG AGGCCGCGGA 7680 GAAGGAGGCAGGCGGCCCCG ATGCCTGCCC GCAATGGGCT GGCTTCCCAA ATCCAGCAAC 7740 TGACCACAGCCGTCAGTGCC CTAGTCATTG GACAGGCAAC TAGACCTCAA ACCCCACGCC 7800 CACGCCCGCCGCCGCGCCAG AAGAAGCAGG CGCCAAAGCA ACCACCGAAG CCGAAGAAAC 7860 CAAAAACACAGGAGAAGAAG AAGAAGCAAC CTGCAAAACC CAAACCCGGA AAGAGACAGC 7920 GTATGGCACTTAAGTTGGAG GCCGACAGAC TGTTCGACGT CAAAAATGAG GACGGAGATG 7980 TCATCGGGCACGCACTGGCC ATGGAAGGAA AGGTAATGAA ACCACTCCAC GTGAAAGGAA 8040 CTATTGACCACCCTGTGCTA TCAAAGCTCA AATTCACCAA GTCGTCAGCA TACGACATGG 8100 AGTTCGCACAGTTGCCGGTC AACATGAGAA GTGAGGCGTT CACCTACACC AGTGAACACC 8160 CTGAAGGGTTCTACAACTGG CACCACGGAG CGGTGCAGTA TAGTGGAGGC AGATTTACCA 8220 TCCCCCGCGGAGTAGGAGGC AGAGGAGACA GTGGTCGTCC GATTATGGAT AACTCAGGCC 8280 GGGTTGTCGCGATAGTCCTC GGAGGGGCTG ATGAGGGAAC AAGAACCGCC CTTTCGGTCG 8340 TCACCTGGAATAGCAAAGGG AAGACAATCA AGACAACCCC GGAAGGGACA GAAGAGTGGT 8400 CTGCTGCACCACTGGTCACG GCCATGTGCT TGCTTGGAAA CGTGAGCTTC CCATGCAATC 8460 GCCCGCCCACATGCTACACC CGCGAACCAT CCAGAGCTCT CGACATCCTC GAAGAGAACG 8520 TGAACCACGAGGCCTACGAC ACCCTGCTCA ACGCCATATT GCGGTGCGGA TCGTCCGGCA 8580 GAAGTAAAAGAAGCGTCACT GACGACTTTA CCTTGACCAG CCCGTACTTG GGCACATGCT 8640 CGTACTGTCACCATACTGAA CCGTGCTTTA GCCCGATTAA GATCGAGCAG GTCTGGGATG 8700 AAGCGGACGACAACACCATA CGCATACAGA CTTCCGCCCA GTTTGGATAC GACCAAAGCG 8760 GAGCAGCAAGCTCAAATAAG TACCGCTACA TGTCGCTCGA GCAGGATCAT ACTGTCAAAG 8820 AAGGCACCATGGATGACATC AAGATCAGCA CCTCAGGACC GTGTAGAAGG CTTAGCTACA 8880 AAGGATACTTTCTCCTCGCG AAGTGTCCTC CAGGGGACAG CGTAACGGTT AGCATAGCGA 8940 
GTAGCAACTCAGCAACGTCA TGCACAATGG CCCGCAAGAT AAAACCAAAA TTCGTGGGAC 9000 GGGAAAAATATGACCTACCT CCCGTTCACG GTAAGAAGAT TCCTTGCACA GTGTACGACC 9060 GTCTGAAAGAAACAACCGCC GGCTACATCA CTATGCACAG GCCGGGACCG CACGCCTATA 9120 CATCCTATCTGGAGGAATCA TCAGGGAAAG TTTACGCGAA GCCACCATCC GGGAAGAACA 9180 TTACGTACGAGTGCAAGTGC GGCGATTACA AGACCGGAAC CGTTACGACC CGTACCGAAA 9240 TCACGGGCTGCACCGCCATC AAGCAGTGCG TCGCCTATAA GAGCGACCAA ACGAAGTGGG 9300 TCTTCAACTCGCCGGACTCG ATCAGACACG CCGACCACAC GGCCCAAGGG AAATTGCATT 9360 TGCCTTTCAAGCTGATCCCG AGTACCTGCA TGGTCCCTGT TGCCCACGCG CCGAACGTAG 9420 TACACGGCTTTAAACACATC AGCCTCCAAT TAGACACAGA CCATCTGACA TTGCTCACCA 9480 CCAGGAGACTAGGGGCAAAC CCGGAACCAA CCACTGAATG GATCATCGGA AACACGGTTA 9540 GAAACTTCACCGTCGACCGA GATGGCCTGG AATACATATG GGGCAATCAC GAACCAGTAA 9600 GGGTCTATGCCCAAGAGTCT GCACCAGGAG ACCCTCACGG ATGGCCACAC GAAATAGTAC 9660 AGCATTACTATCATCGCCAT CCTGTGTACA CCATCTTAGC CGTCGCATCA GCTGCTGTGG 9720 CGATGATGATTGGCGTAACT GTTGCAGCAT TATGTGCCTG TAAAGCGCGC CGTGAGTGCC 9780 TGACGCCATATGCCCTGGCC CCAAATGCCG TGATTCCAAC TTCGCTGGCA CTTTTGTGCT 9840 GTGTTAGGTCGGCTAATGCT GAAACATTCA CCGAGACCAT GAGTTACTTA TGGTCGAACA 9900 GCCAGCCGTTCTTCTGGGTC CAGCTGTGTA TACCTCTGGC CGCTGTCGTC GTTCTAATGC 9960 GCTGTTGCTCATGCTGCCTG CCTTTTTTAG TGGTTGCCGG CGCCTACCTG GCGAAGGTAG 10020 ACGCCTACGAACATGCGACC ACTGTTCCAA ATGTGCCACA GATACCGTAT AAGGCACTTG 10080 TTGAAAGGGCAGGGTACGCC CCGCTCAATT TGGAGATTAC TGTCATGTCC TCGGAGGTTT 10140 TGCCTTCCACCAACCAAGAG TACATTACCT GCAAATTCAC CACTGTGGTC CCCTCCCCTA 10200 AAGTCAGATGCTGCGGCTCC TTGGAATGTC AGCCCGCCGC TCACGCAGAC TATACCTGCA 10260 AGGTCTTTGGAGGGGTGTAC CCCTTCATGT GGGGAGGAGC ACAATGTTTT TGCGACAGTG 10320 AGAACAGCCAGATGAGTGAG GCGTACGTCG AATTGTCAGT AGATTGCGCG ACTGACCACG 10380 CGCAGGCGATTAAGGTGCAT ACTGCCGCGA TGAAAGTAGG ACTGCGTATA GTGTACGGGA 10440 ACACTACCAGTTTCCTAGAT GTGTACGTGA ACGGAGTCAC ACCAGGAACG TCTAAAGACC 10500 TGAAAGTCATAGCTGGACCA ATTTCAGCAT TGTTTACACC ATTCGATCAC AAGGTCGTTA 10560 TCAATCGCGGCCTGGTGTAC AACTATGACT TTCCGGAATA CGGAGCGATG AAACCAGGAG 10620 CGTTTGGAGACATTCAAGCT 
ACCTCCTTGA CTAGCAAAGA CCTCATCGCC AGCACAGACA 10680 TTAGGCTACT CAAGCCTTCC GCCAAGAACG TGCATGTCCC GTACACGCAG GCCGCATCTG 10740 GATTCGAGAT GTGGAAAAAC AACTCAGGCC GCCCACTGCA GGAAACCGCC CCTTTTGGGT 10800 GCAAGATTGC AGTCAATCCG CTTCGAGCGG TGGACTGCTC ATACGGGAAC ATTCCCATTT 10860 CTATTGACAT CCCGAACGCT GCCTTTATCA GGACATCAGA TGCACCACTG GTCTCAACAG 10920 TCAAATGTGA TGTCAGTGAG TGCACTTATT CAGCGGACTT CGGAGGGATG GCTACCCTGC 10980 AGTATGTATC CGACCGCGAA GGACAATGCC CTGTACATTC GCATTCGAGC ACAGCAACCC 11040 TCCAAGAGTC GACAGTTCAT GTCCTGGAGA AAGGAGCGGT GACAGTACAC TTCAGCACCG 11100 CGAGCCCACA GGCGAACTTC ATTGTATCGC TGTGTGGTAA GAAGACAACA TGCAATGCAG 11160 AATGCAAACC ACCAGCTGAT CATATCGTGA GCACCCCGCA CAAAAATGAC CAAGAATTCC 11220 AAGCCGCCAT CTCAAAAACT TCATGGAGTT GGCTGTTTGC CCTTTTCGGC GGCGCCTCGT 11280 CGCTATTAAT TATAGGACTT ATGATTTTTG CTTGCAGCAT GATGCTGACT AGCACACGAA 11340 GATGACCGCT ACGCCCCAAT GACCCGACCA GCAAAACTCG ATGTACTTCC GAGGAACTGA 11400 TGTGCATAAT GCATCAGGCT GGTATATTAG ATCCCCGCTT ACCGCGGGCA ATATAGCAAC 11460 ACCAAAACTC GACGTATTTC CGAGGAAGCG CAGTGCATAA TGCTGCGCAG TGTTGCCAAA 11520 TAATCACTAT ATTAACCATT TATTCAGCGG ACGCCAAAAC TCAATGTATT TCTGAGGAAG 11580 CATGGTGCAT AATGCCATGC AGCGTCTGCA TAACTTTTTA TTATTTCTTT TATTAATCAA 11640 CAAAATTTTG TTTTTAACAT TTC 11663
SEQ ID NO: 8: 11703 base pairs; type: nucleic acid; strandedness: double; topology: linear; molecule type: cDNA
ATTGGCGGCG TAGTACACAC TATTGAATCA AACAGCCGAC CAATTGCACT ACCATCACA 59 ATG GAG AAG CCA GTA GTA AAC GTA GAC GTA GAC CCC CAG AGT CCG TTT 107 GTC GTG CAA CTG CAA AAA AGC TTC CCG CAA TTT GAG GTA GTA GCA CAG 155 CAG GTC ACT CCA AAT GAC CAT GCT AAT GCC AGA GCA TTT TCG CAT CTG 203 GCC AGT AAA CTA ATC GAG CTG GAG GTT CCT ACC ACA GCG ACG ATC TTG 251 GAC ATA GGC AGC GCA CCG GCT CGT AGA ATG TTT TCC GAG CAC CAG TAT 299 CAT TGT GTC TGC CCC ATG CGT AGT CCA GAA GAC CCG GAC CGC ATG ATG 347 AAA TAT GCC AGT AAA CTG GCG GAA AAA GCG TGC AAG ATT ACA AAC AAG 395 AAC TTG CAT GAG AAG ATT AAG GAT CTC CGG ACC GTA CTT GAT ACG CCG 443 GAT GCT GAA ACA CCA TCG CTC TGC TTT CAC AAC GAT GTT ACC TGC AAC 491 ATG CGT GCC GAA TAT TCC GTC ATG CAG GAC GTG TAT ATC AAC GCT
CCC 539 GGA ACT ATC TAT CAT CAG GCT ATG AAA GGC GTGCGG ACC CTG TAC TGG 587 ATT GGC TTC GAC ACC ACC CAG TTC ATG TTC TCG GCTATG GCA GGT TCG 635 TAC CCT GCG TAC AAC ACC AAC TGG GCC GAC GAG AAA GTCCTT GAA GCG 683 CGT AAC ATC GGA CTT TGC AGC ACA AAG CTG AGT GAA GGT AGGACA GGA 731 AAA TTG TCG ATA ATG AGG AAG AAG GAG TTG AAG CCC GGG TCG CGGGTT 779 TAT TTC TCC GTA GGA TCG ACA CTT TAT CCA GAA CAC AGA GCC AGC TTG827 CAG AGC TGG CAT CTT CCA TCG GTG TTC CAC TTG AAT GGA AAG CAG TCG 875TAC ACT TGC CGC TGT GAT ACA GTG GTG AGT TGC GAA GGC TAC GTA GTG 923 AAGAAA ATC ACC ATC AGT CCC GGG ATC ACG GGA GAA ACC GTG GGA TAC 971 GCG GTTACA CAC AAT AGC GAG GGC TTC TTG CTA TGC AAA GTT ACT GAC 1019 ACA GTA AAAGGA GAA CGG GTA TCG TTC CCT GTG TGC ACG TAC ATC CCG 1067 GCC ACC ATA TGCGAT CAG ATG ACT GGT ATA ATG GCC ACG GAT ATA TCA 1115 CCT GAC GAT GCA CAAAAA CTT CTG GTT GGG CTC AAC CAG CGA ATT GTC 1163 ATT AAC GGT AGG ACT AACAGG AAC ACC AAC ACC ATG CAA AAT TAC CTT 1211 CTG CCG ATC ATA GCA CAA GGGTTC AGC AAA TGG GCT AAG GAG CGC AAG 1259 GAT GAT CTT GAT AAC GAG AAA ATGCTG GGT ACT AGA GAA CGC AAG CTT 1307 ACG TAT GGC TGC TTG TGG GCG TTT CGCACT AAG AAA GTA CAT TCG TTT 1355 TAT CGC CCA CCT GGA ACG CAG ACC ATC GTAAAA GTC CCA GCC TCT TTT 1403 AGC GCT TTT CCC ATG TCG TCC GTA TGG ACG ACCTCT TTG CCC ATG TCG 1451 CTG AGG CAG AAA TTG AAA CTG GCA TTG CAA CCA AAGAAG GAG GAA AAA 1499 CTG CTG CAG GTC TCG GAG GAA TTA GTC ATG GAG GCC AAGGCT GCT TTT 1547 GAG GAT GCT CAG GAG GAA GCC AGA GCG GAG AAG CTC CGA GAAGCA CTT 1595 CCA CCA TTA GTG GCA GAC AAA GGC ATC GAG GCA GCC GCA GAA GTTGTC 1643 TGC GAA GTG GAG GGG CTC CAG GCG GAC ATC GGA GCA GCA TTA GTT GAA1691 ACC CCG CGC GGT CAC GTA AGG ATA ATA CCT CAA GCA AAT GAC CGT ATG1739 ATC GGA CAG TAT ATC GTT GTC TCG CCA AAC TCT GTG CTG AAG AAT GCC1787 AAA CTC GCA CCA GCG CAC CCG CTA GCA GAT CAG GTT AAG ATC ATA ACA1835 CAC TCC GGT AGA TCA GGA AGG TAC GCG GTC GAA CCA TAC GAC GCT AAA1883 GTA CTG ATG CCA GCA GGA GGT GCC GTA CCA TGG CCA GAA TTC CTA GCA1931 CTG AGT GAG AGC GCC ACG TTA 
GTG TAC AAC GAA AGA GAG TTT GTG AAC1979 CGC AAA CTA TAC CAC ATT GCC ATG CAT GGC CCC GCC AAG AAT ACA GAA2027 GAG GAG CAG TAC AAG GTT ACA AAG GCA GAG CTT GCA GAA ACA GAG TAC2075 GTG TTT GAC GTG GAC AAG AAG CGT TGC GTT AAG AAG GAA GAA GCC TCA2123 GGT CTG GTC CTC TCG GGA GAA CTG ACC AAC CCT CCC TAT CAT GAG CTA2171 GCT CTG GAG GGA CTG AAG ACC CGA CCT GCG GTC CCG TAC AAG GTC GAA2219 ACA ATA GGA GTG ATA GGC ACA CCG GGG TCG GGC AAG TCA GCT ATT ATC2267 AAG TCA ACT GTC ACG GCA CGG GAT CTT GTT ACC AGC GGA AAG AAA GAA2315 AAT TGT CGC GAA ATT GAG GCC GAC GTG CTA AGA CTG AGG GGT ATG CAG2363 ATT ACG TCG AAG ACA GTA GAT TCG GTT ATG CTC AAC GGA TGC CAC AAA2411 GCC GTA GAA GTG CTG TAC GTT GAC GAA GCG TTC GCG TGC CAC GCA GGA2459 GCA CTA CTT GCC TTG ATT GCT ATC GTC AGG CCC CGC AAG AAG GTA GTA2507 CTA TGC GGA GAC CCC ATG CAA TGC GGA TTC TTC AAC ATG ATG CAA CTA2555 AAG GTA CAT TTC AAT CAC CCT GAA AAA GAC ATA TGC ACC AAG ACA TTC2603 TAC AAG TAT ATC TCC CGG CGT TGC ACA CAG CCA GTT ACA GCT ATT GTA2651 TCG ACA CTG CAT TAC GAT GGA AAG ATG AAA ACC ACG AAC CCG TGC AAG2699 AAG AAC ATT GAA ATC GAT ATT ACA GGG GCC ACA AAG CCG AAG CCA GGG2747 GAT ATC ATC CTG ACA TGT TTC CGC GGG TGG GTT AAG CAA TTG CAA ATC2795 GAC TAT CCC GGA CAT GAA GTA ATG ACA GCC GCG GCC TCA CAA GGG CTA2843 ACC AGA AAA GGA GTG TAT GCC GTC CGG CAA AAA GTC AAT GAA AAC CCA2891 CTG TAC GCG ATC ACA TCA GAG CAT GTG AAC GTG TTG CTC ACC CGC ACT2939 GAG GAC AGG CTA GTG TGG AAA ACC TTG CAG GGC GAC CCA TGG ATT AAG2987 CAG CTC ACT AAC ATA CCT AAA GGA AAC TTT CAG GCT ACT ATA GAG GAC3035 TGG GAA GCT GAA CAC AAG GGA ATA ATT GCT GCA ATA AAC AGC CCC ACT3083 CCC CGT GCC AAT CCG TTC AGC TGC AAG ACC AAC GTT TGC TGG GCG AAA3131 GCA TTG GAA CCG ATA CTA GCC ACG GCC GGT ATC GTA CTT ACC GGT TGC3179 CAG TGG AGC GAA CTG TTC CCA CAG TTT GCG GAT GAC AAA CCA CAT TCG3227 GCC ATT TAC GCC TTA GAC GTA ATT TGC ATT AAG TTT TTC GGC ATG GAC3275 TTG ACA AGC GGA CTG TTT TCT AAA CAG AGC ATC CCA CTA ACG TAC CAT3323 CCC GCC GAT TCA GCG AGG CCG GTA GCT CAT TGG GAC AAC AGC 
CCA GGA3371 ACC CGC AAG TAT GGG TAC GAT CAC GCC ATT GCC GCC GAA CTC TCC CGT3419 AGA TTT CCG GTG TTC CAG CTA GCT GGG AAG GGC ACA CAA CTT GAT TTG3467 CAG ACG GGG AGA ACC AGA GTT ATC TCT GCA CAG CAT AAC CTG GTC CCG3515 GTG AAC CGC AAT CTT CCT CAC GCC TTA GTC CCC GAG TAC AAG GAG AAG3563 CAA CCC GGC CCG GTC GAA AAA TTC TTG AAC CAG TTC AAA CAC CAC TCA3611 GTA CTT GTG GTA TCA GAG GAA AAA ATT GAA GCT CCC CGT AAG AGA ATC3659 GAA TGG ATC GCC CCG ATT GGC ATA GCC GGT GCA GAT AAG AAC TAC AAC3707 CTG GCT TTC GGG TTT CCG CCG CAG GCA CGG TAC GAC CTG GTG TTC ATC3755 AAC ATT GGA ACT AAA TAC AGA AAC CAC CAC TTT CAG CAG TGC GAA GAC3803 CAT GCG GCG ACC TTA AAA ACC CTT TCG CGT TCG GCC CTG AAT TGC CTT3851 AAC CCA GGA GGC ACC CTC GTG GTG AAG TCC TAT GGC TAC GCC GAC CGC3899 AAC AGT GAG GAC GTA GTC ACC GCT CTT GCC AGA AAG TTT GTC AGG GTG3947 TCC GCA GCG AGA CCA GAT TGT GTC TCA AGC AAT ACA GAA ATG TAC CTG3995 ATT TTC CGA CAA CTA GAC AAC AGC CGT ACA CGG CAA TTC ACC CCG CAC4043 CAT CTG AAT TGC GTG ATT TCG TCC GTG TAT GAG GGT ACA AGA GAT GGA4091 GTT GGA GCC GCG CCG TCA TAC CGC ACC AAA AGG GAG AAT ATT GCT GAC4139 TGT CAA GAG GAA GCA GTT GTC AAC GCA GCC AAT CCG CTG GGT AGA CCA4187 GGC GAA GGA GTC TGC CGT GCC ATC TAT AAA CGT TGG CCG ACC AGT TTT4235 ACC GAT TCA GCC ACG GAG ACA GGC ACC GCA AGA ATG ACT GTG TGC CTA4283 GGA AAG AAA GTG ATC CAC GCG GTC GGC CCT GAT TTC CGG AAG CAC CCA4331 GAA GCA GAA GCC TTG AAA TTG CTA CAA AAC GCC TAC CAT GCA GTG GCA4379 GAC TTA GTA AAT GAA CAT AAC ATC AAG TCT GTC GCC ATT CCA CTG CTA4427 TCT ACA GGC ATT TAC GCA GCC GGA AAA GAC CGC CTT GAA GTA TCA CTT4475 AAC TGC TTG ACA ACC GCG CTA GAC AGA ACT GAC GCG GAC GTA ACC ATC4523 TAT TGC CTG GAT AAG AAG TGG AAG GAA AGA ATC GAC GCG GCA CTC CAA4571 CTT AAG GAG TCT GTA ACA GAG CTG AAG GAT GAA GAT ATG GAG ATC GAC4619 GAT GAG TTA GTA TGG ATC CAT CCA GAC AGT TGC TTG AAG GGA AGA AAG4667 GGA TTC AGT ACT ACA AAA GGA AAA TTG TAT TCG TAC TTC GAA GGC ACC4715 AAA TTC CAT CAA GCA GCA AAA GAC ATG GCG GAG ATA AAG GTC CTG TTC4763 CCT AAT GAC CAG 
GAA AGT AAT GAA CAA CTG TGT GCC TAC ATA TTG GGT4811 GAG ACC ATG GAA GCA ATC CGC GAA AAG TGC CCG GTC GAC CAT AAC CCG4859 TCG TCT AGC CCG CCC AAA ACG TTG CCG TGC CTT TGC ATG TAT GCC ATG4907 ACG CCA GAA AGG GTC CAC AGA CTT AGA AGC AAT AAC GTC AAA GAA GTT4955 ACA GTA TGC TCC TCC ACC CCC CTT CCT AAG CAC AAA ATT AAG AAT GTT5003 CAG AAG GTT CAG TGC ACG AAA GTA GTC CTG TTT AAT CCG CAC ACT CCC5051 GCA TTC GTT CCC GCC CGT AAG TAC ATA GAA GTG CCA GAA CAG CCT ACC5099 GCT CCT CCT GCA CAG GCC GAG GAG GCC CCC GAA GTT GTA GCG ACA CCG5147 TCA CCA TCT ACA GCT GAT AAC ACC TCG CTT GAT GTC ACA GAC ATC TCA5195 CTG GAT ATG GAT GAC AGT AGC GAA GGC TCA CTT TTT TCG AGC TTT AGC5243 GGA TCG GAC AAC TCT ATT ACT AGT ATG GAC AGT TGG TCG TCA GGA CCT5291 AGT TCA CTA GAG ATA GTA GAC CGA AGG CAG GTG GTG GTG GCT GAC GTT5339 CAT GCC GTC CAA GAG CCT GCC CCT ATT CCA CCG CCA AGG CTA AAG AAG5387 ATG GCC CGC CTG GCA GCG GCA AGA AAA GAG CCC ACT CCA CCG GCA AGC5435 AAT AGC TCT GAG TCC CTC CAC CTC TCT TTT GGT GGG GTA TCC ATG TCC5483 CTC GGA TCA ATT TTC GAC GGA GAG ACG GCC CGC CAG GCA GCG GTA CAA5531 CCC CTG GCA ACA GGC CCC ACG GAT GTG CCT ATG TCT TTC GGA TCG TTT5579 TCC GAC GGA GAG ATT GAT GAG CTG AGC CGC AGA GTA ACT GAG TCC GAA5627 CCC GTC CTG TTT GGA TCA TTT GAA CCG GGC GAA GTG AAC TCA ATT ATA5675 TCG TCC CGA TCA GCC GTA TCT TTT CCA CTA CGC AAG CAG AGA CGT AGA5723 CGC AGG AGC AGG AGG ACT GAA TAC TGA CTA ACC GGG GTA GGT GGG TAC5771 ATA TTT TCG ACG GAC ACA GGC CCT GGG CAC TTG CAA AAG AAG TCC GTT5819 CTG CAG AAC CAG CTT ACA GAA CCG ACC TTG GAG CGC AAT GTC CTG GAA5867 AGA ATT CAT GCC CCG GTG CTC GAC ACG TCG AAA GAG GAA CAA CTC AAA5915 CTC AGG TAC CAG ATG ATG CCC ACC GAA GCC AAC AAA AGT AGG TAC CAG5963 TCT CGT AAA GTA GAA AAT CAG AAA GCC ATA ACC ACT GAG CGA CTA CTG6011 TCA GGA CTA CGA CTG TAT AAC TCT GCC ACA GAT CAG CCA GAA TGC TAT6059 AAG ATC ACC TAT CCG AAA CCA TTG TAC TCC AGT AGC GTA CCG GCG AAC6107 TAC TCC GAT CCA CAG TTC GCT GTA GCT GTC TGT AAC AAC TAT CTG CAT6155 GAG AAC TAT CCG ACA GTA GCA TCT TAT CAG ATT 
ACT GAC GAG TAC GAT6203 GCT TAC TTG GAT ATG GTA GAC GGG ACA GTC GCC TGC CTG GAT ACT GCA6251 ACC TTC TGC CCC GCT AAG CTT AGA AGT TAC CCG AAA AAA CAT GAG TAT6299 AGA GCC CCG AAT ATC CGC AGT GCG GTT CCA TCA GCG ATG CAG AAC ACG6347 CTA CAA AAT GTG CTC ATT GCC GCA ACT AAA AGA AAT TGC AAC GTC ACG6395 CAG ATG CGT GAA CTG CCA ACA CTG GAC TCA GCG ACA TTC AAT GTC GAA6443 TGC TTT CGA AAA TAT GCA TGT AAT GAC GAG TAT TGG GAG GAG TTC GCT6491 CGG AAG CCA ATT AGG ATT ACC ACT GAG TTT GTC ACC GCA TAT GTA GCT6539 AGA CTG AAA GGC CCT AAG GCC GCC GCA CTA TTT GCA AAG ACG TAT AAT6587 TTG GTC CCA TTG CAA GAA GTG CCT ATG GAT AGA TTC GTC ATG GAC ATG6635 AAA AGA GAC GTG AAA GTT ACA CCA GGC ACG AAA CAC ACA GAA GAA AGA6683 CCG AAA GTA CAA GTG ATA CAA GCC GCA GAA CCC CTG GCG ACT GCT TAC6731 TTA TGC GGG ATT CAC CGG GAA TTA GTG CGT AGG CTT ACG GCC GTC TTG6779 CTT CCA AAC ATT CAC ACG CTT TTT GAC ATG TCG GCG GAG GAT TTT GAT6827 GCA ATC ATA GCA GAA CAC TTC AAG CAA GGC GAC CCG GTA CTG GAG ACG6875 GAT ATC GCA TCA TTC GAC AAA AGC CAA GAC GAC GCT ATG GCG TTA ACC6923 GGT CTG ATG ATC TTG GAG GAC CTG GGT GTG GAT CAA CCA CTA CTC GAC6971 TTG ATC GAG TGC GCC TTT GGA GAA ATA TCA TCC ACC CAT CTA CCT ACG7019 GGT ACT CGT TTT AAA TTC GGG GCG ATG ATG AAA TCC GGA ATG TTC CTC7067 ACA CTT TTT GTC AAC ACA GTT TTG AAT GTC GTT ATC GCC AGC AGA GTA7115 CTA GAA GAG CGG CTT AAA ACG TCC AGA TGT GCA GCG TTC ATT GGC GAC7163 GAC AAC ATC ATA CAT GGA GTA GTA TCT GAC AAA GAA ATG GCT GAG AGG7211 TGC GCC ACC TGG CTC AAC ATG GAG GTT AAG ATC ATC GAC GCA GTC ATC7259 GGT GAG AGA CCA CCT TAC TTC TGC GGC GGA TTT ATC TTG CAA GAT TCG7307 GTT ACT TCC ACA GCG TGC CGC GTG GCG GAC CCC CTG AAA AGG CTG TTT7355 AAG TTG GGT AAA CCG CTC CCA GCC GAC GAC GAG CAA GAC GAA GAC AGA7403 AGA CGC GCT CTG CTA GAT GAA ACA AAG GCG TGG TTT AGA GTA GGT ATA7451 ACA GGC ACT TTA GCA GTG GCC GTG ACG ACC CGG TAT GAG GTA GAC AAT7499 ATT ACA CCT GTC CTA CTG GCA TTG AGA ACT TTT GCC CAG AGC AAA AGA7547 GCA TTC CAA GCC ATC AGA GGG GAA ATA AAG CAT CTC TAC GGT GGT CCT7595 AAA 
TAGTCAGCAT AGTACATTTC ATCTGACTAA TACTACAACA CCACCACC ATG AAT7652 AGA GGA TTC TTT AAC ATG CTC GGC CGC CGC CCC TTC CCG GCC CCC ACT7700 GCC ATG TGG AGG CCG CGG AGA AGG AGG CAG GCG GCC CCG ATG CCT GCC7748 CGC AAC GGG CTG GCT TCT CAA ATC CAG CAA CTG ACC ACA GCC GTC AGT7796 GCC CTA GTC ATT GGA CAG GCA ACT AGA CCT CAA CCC CCA CGT CCA CGC7844 CCG CCA CCG CGC CAG AAG AAG CAG GCG CCC AAG CAA CCA CCG AAG CCG7892 AAG AAA CCA AAA ACG CAG GAG AAG AAG AAG AAG CAA CCT GCA AAA CCC7940 AAA CCC GGA AAG AGA CAG CGC ATG GCA CTT AAG TTG GAG GCC GAC AGA7988 TTG TTC GAC GTC AAG AAC GAG GAC GGA GAT GTC ATC GGG CAC GCA CTG8036 GCC ATG GAA GGA AAG GTA ATG AAA CCT CTG CAC GTG AAA GGA ACC ATC8084 GAC CAC CCT GTG CTA TCA AAG CTC AAA TTT ACC AAG TCG TCA GCA TAC8132 GAC ATG GAG TTC GCA CAG TTG CCA GTC AAC ATG AGA AGT GAG GCA TTC8180 ACC TAC ACC AGT GAA CAC CCC GAA GGA TTC TAT AAC TGG CAC CAC GGA8228 GCG GTG CAG TAT AGT GGA GGT AGA TTT ACC ATC CCT CGC GGA GTA GGA8276 GGC AGA GGA GAC AGC GGT CGT CCG ATC ATG GAT AAC TCC GGT CGG GTT8324 GTC GCG ATA GTC CTC GGT GGA GCT GAT GAA GGA ACA CGA ACT GCC CTT8372 TCG GTC GTC ACC TGG AAT AGT AAA GGG AAG ACA ATT AAG ACG ACC CCG8420 GAA GGG ACA GAA GAG TGG TCC GCA GCA CCA CTG GTC ACG GCA ATG TGT8468 TTG CTC GGA AAT GTG AGC TTC CCA TGC GAC CGC CCG CCC ACA TGC TAT8516 ACC CGC GAA CCT TCC AGA GCC CTC GAC ATC CTT GAA GAG AAC GTG AAC8564 CAT GAG GCC TAC GAT ACC CTG CTC AAT GCC ATA TTG CGG TGC GGA TCG8612 TCT GGC AGA AGC AAA AGA AGC GTC ACT GAC GAC TTT ACC CTG ACC AGC8660 CCC TAC TTG GGC ACA TGC TCG TAC TGC CAC CAT ACT GAA CCG TGC TTC8708 AGC CCT GTT AAG ATC GAG CAG GTC TGG GAC GAA GCG GAC GAT AAC ACC8756 ATA CGC ATA CAG ACT TCC GCC CAG TTT GGA TAC GAC CAA AGC GGA GCA8804 GCA AGC GCA AAC AAG TAC CGC TAC ATG TCG CTT GAG CAG GAT CAC ACC8852 GTT AAA GAA GGC ACC ATG GAT GAC ATC AAG ATT AGC ACC TCA GGA CCG8900 TGT AGA AGG CTT AGC TAC AAA GGA TAC TTT CTC CTC GCA AAA TGC CCT8948 CCA GGG GAC AGC GTA ACG GTT AGC ATA GTG AGT AGC AAC TCA GCA ACG8996 TCA TGT ACA CTG GCC CGC AAG 
ATA AAA CCA AAA TTC GTG GGA CGG GAA9044 AAA TAT GAT CTA CCT CCC GTT CAC GGT AAA AAA ATT CCT TGC ACA GTG9092 TAC GAC CGT CTG AAA GAA ACA ACT GCA GGC TAC ATC ACT ATG CAC AGG9140 CCG GGA CCG CAC GCT TAT ACA TCC TAC CTG GAA GAA TCA TCA GGG AAA9188 GTT TAC GCA AAG CCG CCA TCT GGG AAG AAC ATT ACG TAT GAG TGC AAG9236 TGC GGC GAC TAC AAG ACC GGA ACC GTT TCG ACC CGC ACC GAA ATC ACT9284 GGT TGC ACC GCC ATC AAG CAG TGC GTC GCC TAT AAG AGC GAC CAA ACG9332 AAG TGG GTC TTC AAC TCA CCG GAC TTG ATC AGA CAT GAC GAC CAC ACG9380 GCC CAA GGG AAA TTG CAT TTG CCT TTC AAG TTG ATC CCG AGT ACC TGC9428 ATG GTC CCT GTT GCC CAC GCG CCG AAT GTA ATA CAT GGC TTT AAA CAC9476 ATC AGC CTC CAA TTA GAT ACA GAC CAC TTG ACA TTG CTC ACC ACC AGG9524 AGA CTA GGG GCA AAC CCG GAA CCA ACC ACT GAA TGG ATC GTC GGA AAG9572 ACG GTC AGA AAC TTC ACC GTC GAC CGA GAT GGC CTG GAA TAC ATA TGG9620 GGA AAT CAT GAG CCA GTG AGG GTC TAT GCC CAA GAG TCA GCA CCA GGA9668 GAC CCT CAC GGA TGG CCA CAC GAA ATA GTA CAG CAT TAC TAC CAT CGC9716 CAT CCT GTG TAC ACC ATC TTA GCC GTC GCA TCA GCT ACC GTG GCG ATG9764 ATG ATT GGC GTA ACC GTT GCA GTG TTA TGT GCC TGT AAA GCG CGC CGT9812 GAG TGC CTG ACG CCA TAC GCC CTG GCC CCA AAC GCC GTA ATC CCA ACT9860 TCG CTG GCA CTC TTG TGC TGC GTT AGG TCG GCC AAT GCT GAA ACG TTC9908 ACC GAG ACC ATG AGT TAC TTG TGG TCG AAC AGT CAG CCG TTC TTC TGG9956 GTC CAG TTG TGC ATA CCT TTG GCC GCT TTC ATC GTT CTA ATG CGC TGC10004 TGC TCC TGC TGC CTG CCT TTT TTA GTG GTT GCC GGC GCC TAC CTG GCG10052 AAG GTA GAC GCC TAC GAA CAT GCG ACC ACT GTT CCA AAT GTG CCA CAG10100 ATA CCG TAT AAG GCA CTT GTT GAA AGG GCA GGG TAT GCC CCG CTC AAT10148 TTG GAG ATC ACT GTC ATG TCC TCG GAG GTT TTG CCT TCC ACC AAC CAA10196 GAG TAC ATT ACC TGC AAA TTC ACC ACT GTG GTC CCC TCC CCA AAA ATC10244 AAA TGC TGC GGC TCC TTG GAA TGT CAG CCG GCC GCT CAT GCA GAC TAT10292 ACC TGC AAG GTC TTC GGA GGG GTC TAC CCC TTT ATG TGG GGA GGA GCG10340 CAA TGT TTT TGC GAC AGT GAG AAC AGC CAG ATG AGT GAG GCG TAC GTC10388 GAA CTG TCA GCA GAT TGC GCG TCT GAC CAC GCG 
CAG GCG ATT AAG GTG10436 CAC ACT GCC GCG ATG AAA GTA GGA CTG CGT ATA GTG TAC GGG AAC ACT10484 ACC AGT TTC CTA GAT GTG TAC GTG AAC GGA GTC ACA CCA GGA ACG TCT10532 AAA GAC TTG AAA GTC ATA GCT GGA CCA ATT TCA GCA TCG TTT ACG CCA10580 TTC GAT CAT AAG GTC GTT ATC CAT CGC GGC CTG GTG TAC AAC TAT GAC10628 TTC CCG GAA TAT GGA GCG ATG AAA CCA GGA GCG TTT GGA GAC ATT CAA10676 GCT ACC TCC TTG ACT AGC AAG GAT CTC ATC GCC AGC ACA GAC ATT AGG10724 CTA CTC AAG CCT TCC GCC AAG AAC GTG CAT GTC CCG TAC ACG CAG GCC10772 GCA TCA GGA TTT GAG ATG TGG AAA AAC AAC TCA GGC CGC CCA CTG CAG10820 GAA ACC GCA CCT TTC GGG TGT AAG ATT GCA GTA AAT CCG CTC CGA GCG10868 GTG GAC TGT TCA TAC GGG AAC ATT CCC ATT TCT ATT GAC ATC CCG AAC10916 GCT GCC TTT ATC AGG ACA TCA GAT GCA CCA CTG GTC TCA ACA GTC AAA10964 TGT GAA GTC AGT GAG TGC ACT TAT TCA GCA GAC TTC GGC GGG ATG GCC11012 ACC CTG CAG TAT GTA TCC GAC CGC GAA GGT CAA TGC CCC GTA CAT TCG11060 CAT TCG AGC ACA GCA ACT CTC CAA GAG TCG ACA GTA CAT GTC CTG GAG11108 AAA GGA GCG GTG ACA GTA CAC TTT AGC ACC GCG AGT CCA CAG GCG AAC11156 TTT ATC GTA TCG CTG TGT GGG AAG AAG ACA ACA TGC AAT GCA GAA TGT11204 AAA CCA CCA GCT GAC CAT ATC GTG AGC ACC CCG CAC AAA AAT GAC CAA11252 GAA TTT CAA GCC GCC ATC TCA AAA ACA TCA TGG AGT TGG CTG TTT GCC11300 CTT TTC GGC GGC GCC TCG TCG CTA TTA ATT ATA GGA CTT ATG ATT TTT11348 GCT TGC AGC ATG ATG CTG ACT AGC ACA CGA AGA TGACCGCTAC GCCCCAATGA11401 TCCGACCAGC AAAACTCGAT GTACTTCCGA GGAACTGATG TGCATAATGC ATCAGGCTGG11461 TACATTAGAT CCCCGCTTAC CGCGGGCAAT ATAGCAACAC TAAAAACTCG ATGTACTTCC11521 GAGGAAGCGC AGTGCATAAT GCTGCGCAGT GTTGCCACAT AACCACTATA TTAACCATTT11581 ATCTAGCGGA CGCCAAAAAC TCAATGTATT TCTGAGGAAG CGTGGTGCAT AATGCCACGC11641 AGCGTCTGCA TAACTTTTAT TATTTCTTTT ATTAATCAAC AAAATTTTGT TTTTAACATT11701 TC 11703 2512 amino acids amino acid linear protein 9 Met Glu LysPro Val Val Asn Val Asp Val Asp Pro Gln Ser Pro Phe 1 5 10 15 Val ValGln Leu Gln Lys Ser Phe Pro Gln Phe Glu Val Val Ala Gln 20 25 30 Gln ValThr Pro Asn Asp His Ala Asn Ala 
Arg Ala Phe Ser His Leu 35 40 45 Ala SerLys Leu Ile Glu Leu Glu Val Pro Thr Thr Ala Thr Ile Leu 50 55 60 Asp IleGly Ser Ala Pro Ala Arg Arg Met Phe Ser Glu His Gln Tyr 65 70 75 80 HisCys Val Cys Pro Met Arg Ser Pro Glu Asp Pro Asp Arg Met Met 85 90 95 LysTyr Ala Ser Lys Leu Ala Glu Lys Ala Cys Lys Ile Thr Asn Lys 100 105 110Asn Leu His Glu Lys Ile Lys Asp Leu Arg Thr Val Leu Asp Thr Pro 115 120125 Asp Ala Glu Thr Pro Ser Leu Cys Phe His Asn Asp Val Thr Cys Asn 130135 140 Met Arg Ala Glu Tyr Ser Val Met Gln Asp Val Tyr Ile Asn Ala Pro145 150 155 160 Gly Thr Ile Tyr His Gln Ala Met Lys Gly Val Arg Thr LeuTyr Trp 165 170 175 Ile Gly Phe Asp Thr Thr Gln Phe Met Phe Ser Ala MetAla Gly Ser 180 185 190 Tyr Pro Ala Tyr Asn Thr Asn Trp Ala Asp Glu LysVal Leu Glu Ala 195 200 205 Arg Asn Ile Gly Leu Cys Ser Thr Lys Leu SerGlu Gly Arg Thr Gly 210 215 220 Lys Leu Ser Ile Met Arg Lys Lys Glu LeuLys Pro Gly Ser Arg Val 225 230 235 240 Tyr Phe Ser Val Gly Ser Thr LeuTyr Pro Glu His Arg Ala Ser Leu 245 250 255 Gln Ser Trp His Leu Pro SerVal Phe His Leu Asn Gly Lys Gln Ser 260 265 270 Tyr Thr Cys Arg Cys AspThr Val Val Ser Cys Glu Gly Tyr Val Val 275 280 285 Lys Lys Ile Thr IleSer Pro Gly Ile Thr Gly Glu Thr Val Gly Tyr 290 295 300 Ala Val Thr HisAsn Ser Glu Gly Phe Leu Leu Cys Lys Val Thr Asp 305 310 315 320 Thr ValLys Gly Glu Arg Val Ser Phe Pro Val Cys Thr Tyr Ile Pro 325 330 335 AlaThr Ile Cys Asp Gln Met Thr Gly Ile Met Ala Thr Asp Ile Ser 340 345 350Pro Asp Asp Ala Gln Lys Leu Leu Val Gly Leu Asn Gln Arg Ile Val 355 360365 Ile Asn Gly Arg Thr Asn Arg Asn Thr Asn Thr Met Gln Asn Tyr Leu 370375 380 Leu Pro Ile Ile Ala Gln Gly Phe Ser Lys Trp Ala Lys Glu Arg Lys385 390 395 400 Asp Asp Leu Asp Asn Glu Lys Met Leu Gly Thr Arg Glu ArgLys Leu 405 410 415 Thr Tyr Gly Cys Leu Trp Ala Phe Arg Thr Lys Lys ValHis Ser Phe 420 425 430 Tyr Arg Pro Pro Gly Thr Gln Thr Ile Val Lys ValPro Ala Ser Phe 435 440 445 Ser Ala Phe Pro Met Ser Ser Val Trp Thr ThrSer Leu Pro Met Ser 450 455 460 Leu Arg 
Gln Lys Leu Lys Leu Ala Leu GlnPro Lys Lys Glu Glu Lys 465 470 475 480 Leu Leu Gln Val Ser Glu Glu LeuVal Met Glu Ala Lys Ala Ala Phe 485 490 495 Glu Asp Ala Gln Glu Glu AlaArg Ala Glu Lys Leu Arg Glu Ala Leu 500 505 510 Pro Pro Leu Val Ala AspLys Gly Ile Glu Ala Ala Ala Glu Val Val 515 520 525 Cys Glu Val Glu GlyLeu Gln Ala Asp Ile Gly Ala Ala Leu Val Glu 530 535 540 Thr Pro Arg GlyHis Val Arg Ile Ile Pro Gln Ala Asn Asp Arg Met 545 550 555 560 Ile GlyGln Tyr Ile Val Val Ser Pro Asn Ser Val Leu Lys Asn Ala 565 570 575 LysLeu Ala Pro Ala His Pro Leu Ala Asp Gln Val Lys Ile Ile Thr 580 585 590His Ser Gly Arg Ser Gly Arg Tyr Ala Val Glu Pro Tyr Asp Ala Lys 595 600605 Val Leu Met Pro Ala Gly Gly Ala Val Pro Trp Pro Glu Phe Leu Ala 610615 620 Leu Ser Glu Ser Ala Thr Leu Val Tyr Asn Glu Arg Glu Phe Val Asn625 630 635 640 Arg Lys Leu Tyr His Ile Ala Met His Gly Pro Ala Lys AsnThr Glu 645 650 655 Glu Glu Gln Tyr Lys Val Thr Lys Ala Glu Leu Ala GluThr Glu Tyr 660 665 670 Val Phe Asp Val Asp Lys Lys Arg Cys Val Lys LysGlu Glu Ala Ser 675 680 685 Gly Leu Val Leu Ser Gly Glu Leu Thr Asn ProPro Tyr His Glu Leu 690 695 700 Ala Leu Glu Gly Leu Lys Thr Arg Pro AlaVal Pro Tyr Lys Val Glu 705 710 715 720 Thr Ile Gly Val Ile Gly Thr ProGly Ser Gly Lys Ser Ala Ile Ile 725 730 735 Lys Ser Thr Val Thr Ala ArgAsp Leu Val Thr Ser Gly Lys Lys Glu 740 745 750 Asn Cys Arg Glu Ile GluAla Asp Val Leu Arg Leu Arg Gly Met Gln 755 760 765 Ile Thr Ser Lys ThrVal Asp Ser Val Met Leu Asn Gly Cys His Lys 770 775 780 Ala Val Glu ValLeu Tyr Val Asp Glu Ala Phe Ala Cys His Ala Gly 785 790 795 800 Ala LeuLeu Ala Leu Ile Ala Ile Val Arg Pro Arg Lys Lys Val Val 805 810 815 LeuCys Gly Asp Pro Met Gln Cys Gly Phe Phe Asn Met Met Gln Leu 820 825 830Lys Val His Phe Asn His Pro Glu Lys Asp Ile Cys Thr Lys Thr Phe 835 840845 Tyr Lys Tyr Ile Ser Arg Arg Cys Thr Gln Pro Val Thr Ala Ile Val 850855 860 Ser Thr Leu His Tyr Asp Gly Lys Met Lys Thr Thr Asn Pro Cys Lys865 870 875 880 Lys Asn Ile Glu Ile Asp Ile Thr Gly 
Ala Thr Lys Pro LysPro Gly 885 890 895 Asp Ile Ile Leu Thr Cys Phe Arg Gly Trp Val Lys GlnLeu Gln Ile 900 905 910 Asp Tyr Pro Gly His Glu Val Met Thr Ala Ala AlaSer Gln Gly Leu 915 920 925 Thr Arg Lys Gly Val Tyr Ala Val Arg Gln LysVal Asn Glu Asn Pro 930 935 940 Leu Tyr Ala Ile Thr Ser Glu His Val AsnVal Leu Leu Thr Arg Thr 945 950 955 960 Glu Asp Arg Leu Val Trp Lys ThrLeu Gln Gly Asp Pro Trp Ile Lys 965 970 975 Gln Leu Thr Asn Ile Pro LysGly Asn Phe Gln Ala Thr Ile Glu Asp 980 985 990 Trp Glu Ala Glu His LysGly Ile Ile Ala Ala Ile Asn Ser Pro Thr 995 1000 1005 Pro Arg Ala AsnPro Phe Ser Cys Lys Thr Asn Val Cys Trp Ala Lys 1010 1015 1020 Ala LeuGlu Pro Ile Leu Ala Thr Ala Gly Ile Val Leu Thr Gly Cys 1025 1030 10351040 Gln Trp Ser Glu Leu Phe Pro Gln Phe Ala Asp Asp Lys Pro His Ser1045 1050 1055 Ala Ile Tyr Ala Leu Asp Val Ile Cys Ile Lys Phe Phe GlyMet Asp 1060 1065 1070 Leu Thr Ser Gly Leu Phe Ser Lys Gln Ser Ile ProLeu Thr Tyr His 1075 1080 1085 Pro Ala Asp Ser Ala Arg Pro Val Ala HisTrp Asp Asn Ser Pro Gly 1090 1095 1100 Thr Arg Lys Tyr Gly Tyr Asp HisAla Ile Ala Ala Glu Leu Ser Arg 1105 1110 1115 1120 Arg Phe Pro Val PheGln Leu Ala Gly Lys Gly Thr Gln Leu Asp Leu 1125 1130 1135 Gln Thr GlyArg Thr Arg Val Ile Ser Ala Gln His Asn Leu Val Pro 1140 1145 1150 ValAsn Arg Asn Leu Pro His Ala Leu Val Pro Glu Tyr Lys Glu Lys 1155 11601165 Gln Pro Gly Pro Val Glu Lys Phe Leu Asn Gln Phe Lys His His Ser1170 1175 1180 Val Leu Val Val Ser Glu Glu Lys Ile Glu Ala Pro Arg LysArg Ile 1185 1190 1195 1200 Glu Trp Ile Ala Pro Ile Gly Ile Ala Gly AlaAsp Lys Asn Tyr Asn 1205 1210 1215 Leu Ala Phe Gly Phe Pro Pro Gln AlaArg Tyr Asp Leu Val Phe Ile 1220 1225 1230 Asn Ile Gly Thr Lys Tyr ArgAsn His His Phe Gln Gln Cys Glu Asp 1235 1240 1245 His Ala Ala Thr LeuLys Thr Leu Ser Arg Ser Ala Leu Asn Cys Leu 1250 1255 1260 Asn Pro GlyGly Thr Leu Val Val Lys Ser Tyr Gly Tyr Ala Asp Arg 1265 1270 1275 1280Asn Ser Glu Asp Val Val Thr Ala Leu Ala Arg Lys Phe Val Arg Val 12851290 1295 Ser Ala 
Ala Arg Pro Asp Cys Val Ser Ser Asn Thr Glu Met TyrLeu 1300 1305 1310 Ile Phe Arg Gln Leu Asp Asn Ser Arg Thr Arg Gln PheThr Pro His 1315 1320 1325 His Leu Asn Cys Val Ile Ser Ser Val Tyr GluGly Thr Arg Asp Gly 1330 1335 1340 Val Gly Ala Ala Pro Ser Tyr Arg ThrLys Arg Glu Asn Ile Ala Asp 1345 1350 1355 1360 Cys Gln Glu Glu Ala ValVal Asn Ala Ala Asn Pro Leu Gly Arg Pro 1365 1370 1375 Gly Glu Gly ValCys Arg Ala Ile Tyr Lys Arg Trp Pro Thr Ser Phe 1380 1385 1390 Thr AspSer Ala Thr Glu Thr Gly Thr Ala Arg Met Thr Val Cys Leu 1395 1400 1405Gly Lys Lys Val Ile His Ala Val Gly Pro Asp Phe Arg Lys His Pro 14101415 1420 Glu Ala Glu Ala Leu Lys Leu Leu Gln Asn Ala Tyr His Ala ValAla 1425 1430 1435 1440 Asp Leu Val Asn Glu His Asn Ile Lys Ser Val AlaIle Pro Leu Leu 1445 1450 1455 Ser Thr Gly Ile Tyr Ala Ala Gly Lys AspArg Leu Glu Val Ser Leu 1460 1465 1470 Asn Cys Leu Thr Thr Ala Leu AspArg Thr Asp Ala Asp Val Thr Ile 1475 1480 1485 Tyr Cys Leu Asp Lys LysTrp Lys Glu Arg Ile Asp Ala Ala Leu Gln 1490 1495 1500 Leu Lys Glu SerVal Thr Glu Leu Lys Asp Glu Asp Met Glu Ile Asp 1505 1510 1515 1520 AspGlu Leu Val Trp Ile His Pro Asp Ser Cys Leu Lys Gly Arg Lys 1525 15301535 Gly Phe Ser Thr Thr Lys Gly Lys Leu Tyr Ser Tyr Phe Glu Gly Thr1540 1545 1550 Lys Phe His Gln Ala Ala Lys Asp Met Ala Glu Ile Lys ValLeu Phe 1555 1560 1565 Pro Asn Asp Gln Glu Ser Asn Glu Gln Leu Cys AlaTyr Ile Leu Gly 1570 1575 1580 Glu Thr Met Glu Ala Ile Arg Glu Lys CysPro Val Asp His Asn Pro 1585 1590 1595 1600 Ser Ser Ser Pro Pro Lys ThrLeu Pro Cys Leu Cys Met Tyr Ala Met 1605 1610 1615 Thr Pro Glu Arg ValHis Arg Leu Arg Ser Asn Asn Val Lys Glu Val 1620 1625 1630 Thr Val CysSer Ser Thr Pro Leu Pro Lys His Lys Ile Lys Asn Val 1635 1640 1645 GlnLys Val Gln Cys Thr Lys Val Val Leu Phe Asn Pro His Thr Pro 1650 16551660 Ala Phe Val Pro Ala Arg Lys Tyr Ile Glu Val Pro Glu Gln Pro Thr1665 1670 1675 1680 Ala Pro Pro Ala Gln Ala Glu Glu Ala Pro Glu Val ValAla Thr Pro 1685 1690 1695 Ser Pro Ser Thr Ala Asp Asn Thr Ser 
Leu AspVal Thr Asp Ile Ser 1700 1705 1710 Leu Asp Met Asp Asp Ser Ser Glu GlySer Leu Phe Ser Ser Phe Ser 1715 1720 1725 Gly Ser Asp Asn Ser Ile ThrSer Met Asp Ser Trp Ser Ser Gly Pro 1730 1735 1740 Ser Ser Leu Glu IleVal Asp Arg Arg Gln Val Val Val Ala Asp Val 1745 1750 1755 1760 His AlaVal Gln Glu Pro Ala Pro Ile Pro Pro Pro Arg Leu Lys Lys 1765 1770 1775Met Ala Arg Leu Ala Ala Ala Arg Lys Glu Pro Thr Pro Pro Ala Ser 17801785 1790 Asn Ser Ser Glu Ser Leu His Leu Ser Phe Gly Gly Val Ser MetSer 1795 1800 1805 Leu Gly Ser Ile Phe Asp Gly Glu Thr Ala Arg Gln AlaAla Val Gln 1810 1815 1820 Pro Leu Ala Thr Gly Pro Thr Asp Val Pro MetSer Phe Gly Ser Phe 1825 1830 1835 1840 Ser Asp Gly Glu Ile Asp Glu LeuSer Arg Arg Val Thr Glu Ser Glu 1845 1850 1855 Pro Val Leu Phe Gly SerPhe Glu Pro Gly Glu Val Asn Ser Ile Ile 1860 1865 1870 Ser Ser Arg SerAla Val Ser Phe Pro Leu Arg Lys Gln Arg Arg Arg 1875 1880 1885 Arg ArgSer Arg Arg Thr Glu Tyr Leu Thr Gly Val Gly Gly Tyr Ile 1890 1895 1900Phe Ser Thr Asp Thr Gly Pro Gly His Leu Gln Lys Lys Ser Val Leu 19051910 1915 1920 Gln Asn Gln Leu Thr Glu Pro Thr Leu Glu Arg Asn Val LeuGlu Arg 1925 1930 1935 Ile His Ala Pro Val Leu Asp Thr Ser Lys Glu GluGln Leu Lys Leu 1940 1945 1950 Arg Tyr Gln Met Met Pro Thr Glu Ala AsnLys Ser Arg Tyr Gln Ser 1955 1960 1965 Arg Lys Val Glu Asn Gln Lys AlaIle Thr Thr Glu Arg Leu Leu Ser 1970 1975 1980 Gly Leu Arg Leu Tyr AsnSer Ala Thr Asp Gln Pro Glu Cys Tyr Lys 1985 1990 1995 2000 Ile Thr TyrPro Lys Pro Leu Tyr Ser Ser Ser Val Pro Ala Asn Tyr 2005 2010 2015 SerAsp Pro Gln Phe Ala Val Ala Val Cys Asn Asn Tyr Leu His Glu 2020 20252030 Asn Tyr Pro Thr Val Ala Ser Tyr Gln Ile Thr Asp Glu Tyr Asp Ala2035 2040 2045 Tyr Leu Asp Met Val Asp Gly Thr Val Ala Cys Leu Asp ThrAla Thr 2050 2055 2060 Phe Cys Pro Ala Lys Leu Arg Ser Tyr Pro Lys LysHis Glu Tyr Arg 2065 2070 2075 2080 Ala Pro Asn Ile Arg Ser Ala Val ProSer Ala Met Gln Asn Thr Leu 2085 2090 2095 Gln Asn Val Leu Ile Ala AlaThr Lys Arg Asn Cys Asn Val Thr Gln 
2100 2105 2110 Met Arg Glu Leu ProThr Leu Asp Ser Ala Thr Phe Asn Val Glu Cys 2115 2120 2125 Phe Arg LysTyr Ala Cys Asn Asp Glu Tyr Trp Glu Glu Phe Ala Arg 2130 2135 2140 LysPro Ile Arg Ile Thr Thr Glu Phe Val Thr Ala Tyr Val Ala Arg 2145 21502155 2160 Leu Lys Gly Pro Lys Ala Ala Ala Leu Phe Ala Lys Thr Tyr AsnLeu 2165 2170 2175 Val Pro Leu Gln Glu Val Pro Met Asp Arg Phe Val MetAsp Met Lys 2180 2185 2190 Arg Asp Val Lys Val Thr Pro Gly Thr Lys HisThr Glu Glu Arg Pro 2195 2200 2205 Lys Val Gln Val Ile Gln Ala Ala GluPro Leu Ala Thr Ala Tyr Leu 2210 2215 2220 Cys Gly Ile His Arg Glu LeuVal Arg Arg Leu Thr Ala Val Leu Leu 2225 2230 2235 2240 Pro Asn Ile HisThr Leu Phe Asp Met Ser Ala Glu Asp Phe Asp Ala 2245 2250 2255 Ile IleAla Glu His Phe Lys Gln Gly Asp Pro Val Leu Glu Thr Asp 2260 2265 2270Ile Ala Ser Phe Asp Lys Ser Gln Asp Asp Ala Met Ala Leu Thr Gly 22752280 2285 Leu Met Ile Leu Glu Asp Leu Gly Val Asp Gln Pro Leu Leu AspLeu 2290 2295 2300 Ile Glu Cys Ala Phe Gly Glu Ile Ser Ser Thr His LeuPro Thr Gly 2305 2310 2315 2320 Thr Arg Phe Lys Phe Gly Ala Met Met LysSer Gly Met Phe Leu Thr 2325 2330 2335 Leu Phe Val Asn Thr Val Leu AsnVal Val Ile Ala Ser Arg Val Leu 2340 2345 2350 Glu Glu Arg Leu Lys ThrSer Arg Cys Ala Ala Phe Ile Gly Asp Asp 2355 2360 2365 Asn Ile Ile HisGly Val Val Ser Asp Lys Glu Met Ala Glu Arg Cys 2370 2375 2380 Ala ThrTrp Leu Asn Met Glu Val Lys Ile Ile Asp Ala Val Ile Gly 2385 2390 23952400 Glu Arg Pro Pro Tyr Phe Cys Gly Gly Phe Ile Leu Gln Asp Ser Val2405 2410 2415 Thr Ser Thr Ala Cys Arg Val Ala Asp Pro Leu Lys Arg LeuPhe Lys 2420 2425 2430 Leu Gly Lys Pro Leu Pro Ala Asp Asp Glu Gln AspGlu Asp Arg Arg 2435 2440 2445 Arg Ala Leu Leu Asp Glu Thr Lys Ala TrpPhe Arg Val Gly Ile Thr 2450 2455 2460 Gly Thr Leu Ala Val Ala Val ThrThr Arg Tyr Glu Val Asp Asn Ile 2465 2470 2475 2480 Thr Pro Val Leu LeuAla Leu Arg Thr Phe Ala Gln Ser Lys Arg Ala 2485 2490 2495 Phe Gln AlaIle Arg Gly Glu Ile Lys His Leu Tyr Gly Gly Pro Lys 2500 2505 2510 1245amino 
acids amino acid linear protein 10 Met Asn Arg Gly Phe Phe Asn MetLeu Gly Arg Arg Pro Phe Pro Ala 1 5 10 15 Pro Thr Ala Met Trp Arg ProArg Arg Arg Arg Gln Ala Ala Pro Met 20 25 30 Pro Ala Arg Asn Gly Leu AlaSer Gln Ile Gln Gln Leu Thr Thr Ala 35 40 45 Val Ser Ala Leu Val Ile GlyGln Ala Thr Arg Pro Gln Pro Pro Arg 50 55 60 Pro Arg Pro Pro Pro Arg GlnLys Lys Gln Ala Pro Lys Gln Pro Pro 65 70 75 80 Lys Pro Lys Lys Pro LysThr Gln Glu Lys Lys Lys Lys Gln Pro Ala 85 90 95 Lys Pro Lys Pro Gly LysArg Gln Arg Met Ala Leu Lys Leu Glu Ala 100 105 110 Asp Arg Leu Phe AspVal Lys Asn Glu Asp Gly Asp Val Ile Gly His 115 120 125 Ala Leu Ala MetGlu Gly Lys Val Met Lys Pro Leu His Val Lys Gly 130 135 140 Thr Ile AspHis Pro Val Leu Ser Lys Leu Lys Phe Thr Lys Ser Ser 145 150 155 160 AlaTyr Asp Met Glu Phe Ala Gln Leu Pro Val Asn Met Arg Ser Glu 165 170 175Ala Phe Thr Tyr Thr Ser Glu His Pro Glu Gly Phe Tyr Asn Trp His 180 185190 His Gly Ala Val Gln Tyr Ser Gly Gly Arg Phe Thr Ile Pro Arg Gly 195200 205 Val Gly Gly Arg Gly Asp Ser Gly Arg Pro Ile Met Asp Asn Ser Gly210 215 220 Arg Val Val Ala Ile Val Leu Gly Gly Ala Asp Glu Gly Thr ArgThr 225 230 235 240 Ala Leu Ser Val Val Thr Trp Asn Ser Lys Gly Lys ThrIle Lys Thr 245 250 255 Thr Pro Glu Gly Thr Glu Glu Trp Ser Ala Ala ProLeu Val Thr Ala 260 265 270 Met Cys Leu Leu Gly Asn Val Ser Phe Pro CysAsp Arg Pro Pro Thr 275 280 285 Cys Tyr Thr Arg Glu Pro Ser Arg Ala LeuAsp Ile Leu Glu Glu Asn 290 295 300 Val Asn His Glu Ala Tyr Asp Thr LeuLeu Asn Ala Ile Leu Arg Cys 305 310 315 320 Gly Ser Ser Gly Arg Ser LysArg Ser Val Thr Asp Asp Phe Thr Leu 325 330 335 Thr Ser Pro Tyr Leu GlyThr Cys Ser Tyr Cys His His Thr Glu Pro 340 345 350 Cys Phe Ser Pro ValLys Ile Glu Gln Val Trp Asp Glu Ala Asp Asp 355 360 365 Asn Thr Ile ArgIle Gln Thr Ser Ala Gln Phe Gly Tyr Asp Gln Ser 370 375 380 Gly Ala AlaSer Ala Asn Lys Tyr Arg Tyr Met Ser Leu Glu Gln Asp 385 390 395 400 HisThr Val Lys Glu Gly Thr Met Asp Asp Ile Lys Ile Ser Thr Ser 405 410 415Gly Pro Cys 
Arg Arg Leu Ser Tyr Lys Gly Tyr Phe Leu Leu Ala Lys 420 425430 Cys Pro Pro Gly Asp Ser Val Thr Val Ser Ile Val Ser Ser Asn Ser 435440 445 Ala Thr Ser Cys Thr Leu Ala Arg Lys Ile Lys Pro Lys Phe Val Gly450 455 460 Arg Glu Lys Tyr Asp Leu Pro Pro Val His Gly Lys Lys Ile ProCys 465 470 475 480 Thr Val Tyr Asp Arg Leu Lys Glu Thr Thr Ala Gly TyrIle Thr Met 485 490 495 His Arg Pro Gly Pro His Ala Tyr Thr Ser Tyr LeuGlu Glu Ser Ser 500 505 510 Gly Lys Val Tyr Ala Lys Pro Pro Ser Gly LysAsn Ile Thr Tyr Glu 515 520 525 Cys Lys Cys Gly Asp Tyr Lys Thr Gly ThrVal Ser Thr Arg Thr Glu 530 535 540 Ile Thr Gly Cys Thr Ala Ile Lys GlnCys Val Ala Tyr Lys Ser Asp 545 550 555 560 Gln Thr Lys Trp Val Phe AsnSer Pro Asp Leu Ile Arg His Asp Asp 565 570 575 His Thr Ala Gln Gly LysLeu His Leu Pro Phe Lys Leu Ile Pro Ser 580 585 590 Thr Cys Met Val ProVal Ala His Ala Pro Asn Val Ile His Gly Phe 595 600 605 Lys His Ile SerLeu Gln Leu Asp Thr Asp His Leu Thr Leu Leu Thr 610 615 620 Thr Arg ArgLeu Gly Ala Asn Pro Glu Pro Thr Thr Glu Trp Ile Val 625 630 635 640 GlyLys Thr Val Arg Asn Phe Thr Val Asp Arg Asp Gly Leu Glu Tyr 645 650 655Ile Trp Gly Asn His Glu Pro Val Arg Val Tyr Ala Gln Glu Ser Ala 660 665670 Pro Gly Asp Pro His Gly Trp Pro His Glu Ile Val Gln His Tyr Tyr 675680 685 His Arg His Pro Val Tyr Thr Ile Leu Ala Val Ala Ser Ala Thr Val690 695 700 Ala Met Met Ile Gly Val Thr Val Ala Val Leu Cys Ala Cys LysAla 705 710 715 720 Arg Arg Glu Cys Leu Thr Pro Tyr Ala Leu Ala Pro AsnAla Val Ile 725 730 735 Pro Thr Ser Leu Ala Leu Leu Cys Cys Val Arg SerAla Asn Ala Glu 740 745 750 Thr Phe Thr Glu Thr Met Ser Tyr Leu Trp SerAsn Ser Gln Pro Phe 755 760 765 Phe Trp Val Gln Leu Cys Ile Pro Leu AlaAla Phe Ile Val Leu Met 770 775 780 Arg Cys Cys Ser Cys Cys Leu Pro PheLeu Val Val Ala Gly Ala Tyr 785 790 795 800 Leu Ala Lys Val Asp Ala TyrGlu His Ala Thr Thr Val Pro Asn Val 805 810 815 Pro Gln Ile Pro Tyr LysAla Leu Val Glu Arg Ala Gly Tyr Ala Pro 820 825 830 Leu Asn Leu Glu IleThr Val Met Ser Ser Glu 
Val Leu Pro Ser Thr 835 840 845 Asn Gln Glu TyrIle Thr Cys Lys Phe Thr Thr Val Val Pro Ser Pro 850 855 860 Lys Ile LysCys Cys Gly Ser Leu Glu Cys Gln Pro Ala Ala His Ala 865 870 875 880 AspTyr Thr Cys Lys Val Phe Gly Gly Val Tyr Pro Phe Met Trp Gly 885 890 895Gly Ala Gln Cys Phe Cys Asp Ser Glu Asn Ser Gln Met Ser Glu Ala 900 905910 Tyr Val Glu Leu Ser Ala Asp Cys Ala Ser Asp His Ala Gln Ala Ile 915920 925 Lys Val His Thr Ala Ala Met Lys Val Gly Leu Arg Ile Val Tyr Gly930 935 940 Asn Thr Thr Ser Phe Leu Asp Val Tyr Val Asn Gly Val Thr ProGly 945 950 955 960 Thr Ser Lys Asp Leu Lys Val Ile Ala Gly Pro Ile SerAla Ser Phe 965 970 975 Thr Pro Phe Asp His Lys Val Val Ile His Arg GlyLeu Val Tyr Asn 980 985 990 Tyr Asp Phe Pro Glu Tyr Gly Ala Met Lys ProGly Ala Phe Gly Asp 995 1000 1005 Ile Gln Ala Thr Ser Leu Thr Ser LysAsp Leu Ile Ala Ser Thr Asp 1010 1015 1020 Ile Arg Leu Leu Lys Pro SerAla Lys Asn Val His Val Pro Tyr Thr 1025 1030 1035 1040 Gln Ala Ala SerGly Phe Glu Met Trp Lys Asn Asn Ser Gly Arg Pro 1045 1050 1055 Leu GlnGlu Thr Ala Pro Phe Gly Cys Lys Ile Ala Val Asn Pro Leu 1060 1065 1070Arg Ala Val Asp Cys Ser Tyr Gly Asn Ile Pro Ile Ser Ile Asp Ile 10751080 1085 Pro Asn Ala Ala Phe Ile Arg Thr Ser Asp Ala Pro Leu Val SerThr 1090 1095 1100 Val Lys Cys Glu Val Ser Glu Cys Thr Tyr Ser Ala AspPhe Gly Gly 1105 1110 1115 1120 Met Ala Thr Leu Gln Tyr Val Ser Asp ArgGlu Gly Gln Cys Pro Val 1125 1130 1135 His Ser His Ser Ser Thr Ala ThrLeu Gln Glu Ser Thr Val His Val 1140 1145 1150 Leu Glu Lys Gly Ala ValThr Val His Phe Ser Thr Ala Ser Pro Gln 1155 1160 1165 Ala Asn Phe IleVal Ser Leu Cys Gly Lys Lys Thr Thr Cys Asn Ala 1170 1175 1180 Glu CysLys Pro Pro Ala Asp His Ile Val Ser Thr Pro His Lys Asn 1185 1190 11951200 Asp Gln Glu Phe Gln Ala Ala Ile Ser Lys Thr Ser Trp Ser Trp Leu1205 1210 1215 Phe Ala Leu Phe Gly Gly Ala Ser Ser Leu Leu Ile Ile GlyLeu Met 1220 1225 1230 Ile Phe Ala Cys Ser Met Met Leu Thr Ser Thr ArgArg 1235 1240 1245 20 base pairs nucleic acid single 
linear other nucleic acid /desc = “oligonucleotide” 11 CTGCGGCGGA TTCATCTTGC 20 14 base pairs nucleic acid single linear other nucleic acid /desc = “oligonucleotide” 12 CTCCAACTTA AGTG 14

That which is claimed is:
 1. A method of introducing and expressing heterologous RNA in bone marrow cells, comprising: (a) providing a recombinant alphavirus, said alphavirus containing a replicon RNA comprising a heterologous RNA to be expressed in said bone marrow cells; and (b) contacting said recombinant alphavirus to said bone marrow cells so that said heterologous RNA is introduced and expressed therein.
 2. The method of claim 1, wherein the structural proteins of said alphavirus are South African Arbovirus No. 86 (S.A.AR86) structural proteins.
 3. The method of claim 1, wherein the structural proteins of said alphavirus are Girdwood S.A. structural proteins.
 4. The method of claim 1, wherein the structural proteins of said alphavirus are TR339 structural proteins.
 5. The method of claim 1, wherein the structural proteins of said alphavirus are Sindbis virus structural proteins.
 6. A method of introducing and expressing heterologous RNA in bone marrow cells, comprising: (a) providing a recombinant South African Arbovirus No. 86 (S.A.AR86), said S.A.AR86 containing a heterologous RNA to be expressed in said bone marrow cells; and (b) contacting said recombinant S.A.AR86 to said bone marrow cells so that said heterologous RNA is introduced and expressed therein.
 7. The method of claim 6, wherein said bone marrow cells are selected from the group consisting of polymorphonuclear cells, hematopoietic stem cells, erythrocytes, macrophages, fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells, and chondrocytes.
 8. The method of claim 7, wherein said bone marrow cells are osteoblasts.
 9. The method of claim 6, wherein said bone marrow cells are in a synovial joint.
 10. The method of claim 6, wherein said bone marrow cells are endosteum cells.
 11. The method of claim 6, wherein said bone marrow cells are endosteum cells of a synovial joint.
 12. The method of claim 11, wherein said bone marrow cells are osteoblasts.
 13. The method according to claim 6, wherein said contacting step is carried out in vitro.
 14. The method according to claim 13, further comprising the step of administering the bone marrow cells to a subject in need thereof.
 15. The method according to claim 6, wherein said contacting step is carried out in vivo in a subject in need of such treatment.
 16. The method according to claim 15, wherein said S.A.AR86 is administered by a parenteral route.
 17. The method according to claim 16, wherein said S.A.AR86 is administered by a method selected from the group consisting of subcutaneous, intracerebral, intradermal, intramuscular, intravenous and intraarticular administration.
 18. The method according to claim 6, wherein said heterologous RNA encodes a protein or peptide.
 19. The method according to claim 18, wherein said heterologous RNA encodes an immunogenic protein or peptide.
 20. The method according to claim 19, wherein said immunogenic protein or peptide is a viral antigen.
 21. The method according to claim 20, wherein said immunogenic protein or peptide is selected from the group consisting of an influenza immunogen, an orthomyxovirus immunogen, a lentivirus immunogen, an equine infectious anemia virus immunogen, a simian immunodeficiency virus immunogen, a human immunodeficiency virus immunogen, a Lassa fever virus immunogen, an arenavirus immunogen, a vaccinia virus immunogen, a poxvirus immunogen, a yellow fever virus immunogen, a Japanese encephalitis virus immunogen, a flavivirus immunogen, an Ebola virus immunogen, a Marburg virus immunogen, a filovirus immunogen, a bunyavirus immunogen, a Rift Valley Fever immunogen, a Congo-Crimean hemorrhagic fever virus immunogen, a Sandfly fever Sicilian virus immunogen, and a coronavirus immunogen.
 22. The method according to claim 20, wherein said immunogenic protein or peptide is a human immunodeficiency virus immunogen.
 23. The method according to claim 19, wherein said S.A.AR86 virus contains a heterologous RNA segment, said heterologous RNA segment comprising a promoter operable in said bone marrow cells operatively associated with said heterologous RNA.
 24. The method of claim 23, wherein said promoter is a S.A.AR86 26S promoter.
 25. The method according to claim 18, wherein said heterologous RNA encodes a therapeutic protein or peptide.
 26. The method according to claim 25, wherein said protein or peptide is selected from the group consisting of hormones, growth factors, interleukins, cytokines, chemokines, and enzymes.
 27. The method according to claim 6, wherein said heterologous RNA encodes an antisense oligonucleotide or a ribozyme.
 28. The method according to claim 6, wherein said S.A.AR86 contains a replicon RNA comprising the heterologous RNA.
 29. The method of claim 6, wherein said S.A.AR86 comprises one or more attenuating mutations.
 30. A method of introducing and expressing heterologous RNA in bone marrow cells, comprising: (a) providing a recombinant Girdwood S.A. virus, said Girdwood S.A. virus containing a heterologous RNA to be expressed in said bone marrow cells; and (b) contacting said recombinant Girdwood S.A. virus to said bone marrow cells so that said heterologous RNA is introduced and expressed therein.
 31. The method of claim 30, wherein said bone marrow cells are selected from the group consisting of polymorphonuclear cells, hematopoietic stem cells, erythrocytes, macrophages, fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells, and chondrocytes.
 32. The method according to claim 30, wherein said contacting step is carried out in vitro.
 33. The method according to claim 32, further comprising the step of administering the bone marrow cells to a subject in need thereof.
 34. The method according to claim 30, wherein said heterologous RNA encodes an immunogenic protein or peptide.
 35. The method according to claim 30, wherein said heterologous RNA encodes a therapeutic protein or peptide.
 36. The method according to claim 30, wherein said heterologous RNA encodes an antisense oligonucleotide or a ribozyme.
 37. The method according to claim 30, wherein said Girdwood S.A. contains a replicon RNA comprising the heterologous RNA.
 38. A method of introducing and expressing heterologous RNA in bone marrow cells, comprising: (a) providing a recombinant Sindbis virus, said Sindbis virus containing a heterologous RNA to be expressed in said bone marrow cells; and (b) contacting said recombinant Sindbis virus to said bone marrow cells so that said heterologous RNA is introduced and expressed therein.
 39. The method of claim 38, wherein said bone marrow cells are selected from the group consisting of polymorphonuclear cells, hematopoietic stem cells, erythrocytes, macrophages, fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells, and chondrocytes.
 40. The method according to claim 38, wherein said contacting step is carried out in vitro.
 41. The method according to claim 40, further comprising the step of administering the bone marrow cells to a subject in need thereof.
 42. The method according to claim 38, wherein said heterologous RNA encodes an immunogenic protein or peptide.
 43. The method according to claim 38, wherein said heterologous RNA encodes a therapeutic protein or peptide.
 44. The method according to claim 38, wherein said heterologous RNA encodes an antisense oligonucleotide or a ribozyme.
 45. The method according to claim 38, wherein said Sindbis virus contains a replicon RNA comprising the heterologous RNA.
 46. A method of introducing and expressing heterologous RNA in bone marrow cells, comprising: (a) providing a recombinant Sindbis strain TR339 virus, said Sindbis strain TR339 virus containing a heterologous RNA to be expressed in said bone marrow cells; and (b) contacting said recombinant Sindbis strain TR339 virus to said bone marrow cells so that said heterologous RNA is introduced and expressed therein.
 47. The method of claim 46, wherein said bone marrow cells are selected from the group consisting of polymorphonuclear cells, hematopoietic stem cells, erythrocytes, macrophages, fibroblasts, osteoprogenitor cells, osteoblasts, osteoclasts, marrow stromal cells, and chondrocytes.
 48. The method according to claim 46, wherein said contacting step is carried out in vitro.
 49. The method according to claim 48, further comprising the step of administering the bone marrow cells to a subject in need thereof.
 50. The method according to claim 46, wherein said heterologous RNA encodes an immunogenic protein or peptide.
 51. The method according to claim 46, wherein said heterologous RNA encodes a therapeutic protein or peptide.
 52. The method according to claim 46, wherein said heterologous RNA encodes an antisense oligonucleotide or a ribozyme.
 53. The method according to claim 46, wherein said Sindbis strain TR339 virus contains a replicon RNA comprising the heterologous RNA.